FRACTIONAL REPLICATION FOR MIXED SERIES


Milton Morrison*


Experimental Towing Tank, Stevens Institute of Technology 


1. Introduction 


The type of experimental designs to be described in this paper 
was necessitated by a situation which often occurs in engineering 
research. In such situations, a considerable number of independent 
variables are involved, and project engineers are accustomed to pre- 
senting their results in the form of large numbers of families of curves 
obtained by keeping all independent variables constant except one. 
Since each independent variable (factor) is assigned several values 
(levels) the number of test points to be obtained becomes large, some- 
times forbiddingly so. Virtually all the test saving experimental 
designs which appear in statistical literature require either that the 
factors involved are at a convenient number of levels (for example, 
much has been written on designs in which all factors are at the same 
number of levels), or that some or all of the two factor interactions are 
zero. These conditions rarely obtain in engineering experimentation 
and hence the standard designs cannot be used except in special 
situations. 

However, the method known as fractional replication gives promise
of alleviating this situation. Using fractional replication, it is possible,
under certain conditions, to test only a portion of all the possible
combinations of levels of factors, and yet be able to test hypotheses on the
existence of all main effects and all two factor interactions. What is
more important from the point of view of the test engineer is that it is
also possible to estimate all main effects and two factor interactions and
hence estimate that portion of the observations which have not been
made under the plan, so that if he chooses to present results graphically,
all points obtained, either by actual tests or by estimation, can be
plotted.

*This work was sponsored by the Office of Naval Research.
A portion of this paper was presented at the New York University meeting of the Section on
Physical and Engineering Sciences of the American Statistical Association, May 1955.

Unfortunately, however, the method of fractional replication is 
strongest when all factors are at two levels, and such situations are not 
too frequently encountered. However, in the next section, a method 
will be described whereby fractional replication can be used to ad- 
vantage even when all factors are not at the same number of levels, 
under conditions which often exist in experimentation performed for 
engineering research programs, and indeed in other fields also. These 
conditions are: 


a. When the primary interest is in reducing the number of observa- 
tions required in a complete replication. 

b. When three-factor and higher order interactions can be con- 
sidered to equal zero. 

c. When the observations are independent and their distributions 
have equal variances. 

d. When there are at least five factors. The development which
follows will be mostly in connection with the five factor situation
but the method is applicable virtually without alteration when
there are more than five factors. The method could be used with
fewer than five factors, but the considerable confounding of main
effects and interactions in such situations would weaken its
effectiveness.


2. The Procedure 


The method requires that the total number of possible data points
be expressed in the form: $z_n(2^n) + z_{n-1}(2^{n-1}) + \cdots + z_1(2) + z_0$, where
$z_0$ equals 1 or 0 according as there is an odd or even number
of possible observations. For all groups in which the exponent is equal
to or greater than five, a straightforward fractional replication is
possible which does not demand that any two factor interaction be
assumed to equal zero. By means to be described, half the observations
associated with the remaining terms (except $z_0$, if $z_0 = 1$) can be
dispensed with. The procedure will be illustrated by an example.

Assume there are five independent variables, four at 2 levels and
one at 3 levels. If all possible combinations are tested, there would be
(2)(2)(2)(2)(3) = 48 observations. Write: 48 = (2)(2)(2)(2)(2 + 1)
and then to preserve the association between each number and the
factors and levels it represents, write:


$2(a_1a_2) \cdot 2(b_1b_2) \cdot 2(c_1c_2) \cdot 2(d_1d_2) \cdot [2(e_1e_2) + 1(e_3)]$   (i)

Carrying out the formal multiplication in (i) we obtain


$2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2) + 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3)$   (ii)


The total number of observations, 48, can be expressed as $48 = 1 \cdot 2^5 + 1 \cdot 2^4$.
Thus, in this example, $z_5 = z_4 = 1$; $z_3 = z_2 = z_1 = z_0 = 0$.
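
The formal multiplication can be sketched in a few lines of Python. This is only an illustration under one assumption: each factor's levels are split into pairs, plus a single leftover level when the number of levels is odd, as in (i). The function name `expand_levels` is hypothetical.

```python
from itertools import product

def expand_levels(levels):
    """Formally multiply out the factor levels written as 2 + 2 + ... (+ 1).

    Returns {exponent: count}: how many terms of size 2**exponent appear
    after the product is expanded.
    """
    parts = []
    for n in levels:
        # n levels -> n // 2 pairs, plus one single level when n is odd
        parts.append([2] * (n // 2) + ([1] if n % 2 else []))
    counts = {}
    for combo in product(*parts):
        exponent = sum(1 for block in combo if block == 2)
        counts[exponent] = counts.get(exponent, 0) + 1
    return counts

terms = expand_levels([2, 2, 2, 2, 3])
print(terms)  # {5: 1, 4: 1}
```

For four factors at 2 levels and one at 3 levels this returns one term of size 2^5 and one of size 2^4, matching (ii).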

The first term in (ii), $2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2)$, corresponds
to the $2^5$ observations which can be obtained by taking all possible
combinations of A, B, C, and D and the first two levels of E. Then
for this portion of the design, a half-replication which permits estimation
of all main effects A, B, C, D, E and two-factor interactions AB, AC,
AD, AE, BC, BD, BE, CD, CE, DE (higher order interactions assumed
to be zero) would be specified by the first 16 rows of (iii). The
observations thus specified were selected by making the ABCDE interaction
the defining contrast, as described in Reference [1].

Next consider the second term in (ii), $2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3)$.
This term corresponds to the $2^4$ observations which can be obtained
by taking all possible combinations of $a_1a_2$; $b_1b_2$; $c_1c_2$; $d_1d_2$; at
$e = e_3$. To choose half these observations, suppose first we had the
problem of testing or estimating all observations which could be made
for all combinations of A, B, C, and D and the second and third levels
of E; which, of course, includes the second term in (ii) as a one-half
subset. Then a one-half replication would be the same as that performed
for the first term in (ii) except that $e_1$ would be replaced by $e_3$. However,
$e_1$ appears in only 8 of the 16 observations in (iii). Hence, it would be
necessary to make only 8 additional observations in order to obtain by
observation and estimation all 16 combinations represented by the
second term in (ii).

Thus, from half the observations corresponding to the first and 
second terms of (ii), the other half could be estimated. The method is 
easily extended to situations where the factors are at higher levels than 
for the experimental situation discussed above, or for situations where 
there are more than five factors. The designs obtained under the above 
described procedure will be called “estimation” designs. 
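
As a rough sketch, the runs of such a design can be generated mechanically. The rule used below is a restatement that reproduces the designs tabulated in this paper, not the paper's own wording: a run is kept when an odd number of its factors sit at an even-numbered level, so that levels 3 and 5 behave like level 1 and level 4 behaves like level 2. The function name `estimation_design` is hypothetical.

```python
from itertools import product

def estimation_design(levels):
    """Runs kept in the half-replicate 'estimation' design.

    Assumed rule: keep a run when an odd number of its factors are at an
    even-numbered level. Each run is a tuple of 1-based levels.
    """
    all_runs = product(*(range(1, n + 1) for n in levels))
    return [run for run in all_runs
            if sum(1 for lvl in run if lvl % 2 == 0) % 2 == 1]

design = estimation_design([2, 2, 2, 2, 3])
print(len(design))  # 24
```

With four factors at 2 levels and one at 3 levels the sketch returns 24 of the 48 possible runs: sixteen from the first two levels of E and eight at the third level.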

The design has been formed using a method of estimation as the
motivating element. This method of estimation has much to recommend
it, though it differs from the traditional procedure. Once the design is
made, however, one need not retain the method of estimation but can
instead form least squares estimates.


4 BIOMETRICS, MARCH 1956

One-half the observations associated with $2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3)$
have been chosen by the means described above and so under
this design, the experimental conditions at which observations are to
be taken are:


a b c d e

2 1 1 1 1
1 2 1 1 1
1 1 2 1 1
1 1 1 2 1
2 2 2 1 1
2 2 1 2 1
2 1 2 2 1
1 2 2 2 1
1 1 1 1 2
2 2 1 1 2
2 1 2 1 2
2 1 1 2 2
1 2 2 1 2
1 2 1 2 2
1 1 2 2 2
2 2 2 2 2
2 1 1 1 3
1 2 1 1 3
1 1 2 1 3
1 1 1 2 3
2 2 2 1 3
2 2 1 2 3
2 1 2 2 3
1 2 2 2 3

(iii)


This, then, describes the manner in which the design is made. Had
there been a $2^3$ term, four more observations would be dictated by
the procedure, and similarly for $2^2$ and 2. If all factors are at an odd
number of levels, an even half-replication will, of course, be impossible.
To illustrate further the manner of setting up the designs, we include
the designs for the $2^4 \times 4$ and $2^3 \times 3 \times 5$ half replicates.

For the $2^4 \times 4$ experiment we write:


$[2(a_1a_2)][2(b_1b_2)][2(c_1c_2)][2(d_1d_2)][2(e_1e_2) + 2(e_3e_4)]$

$= 2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2) + 2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3e_4)$

so that the observations to be taken are simply half-replicates of both
terms as shown in (iv).


$2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2)$   $2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3e_4)$


21111 21113 
12111 12113 
11211 11213 
11121 11123 
11112 11114 
22211 22213 
22121 22123 
22112 22114 (iv) 
21221 21223 
21212 21214 
21122 21124 
12212 12214 
12221 12223 
12122 12124 
11222 11224 
22222 22224 


Note the symmetry of the design and the ease with which it is 
formed. This is characteristic of experimental situations in which all 
factors are at two levels. 

For the $2^3 \times 3 \times 5$ experiment, we write:


$[2(a_1a_2)][2(b_1b_2)][2(c_1c_2)][2(d_1d_2) + d_3][2(e_1e_2) + 2(e_3e_4) + e_5]$
$= 2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_1e_2) + 2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_3e_4)$
$+ 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_3)(e_1e_2) + 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_3)(e_3e_4)$
$+ 2^4(a_1a_2)(b_1b_2)(c_1c_2)(d_1d_2)(e_5) + 2^3(a_1a_2)(b_1b_2)(c_1c_2)(d_3)(e_5)$


The design for the half-replication is given by Table I. In this table
subscripts obtained by the replacement process are in the same row.

Columns (1) and (2) in Table I are ordinary half-replications;
column (3) is obtained by assuming that it is required to find by
observation or estimation all observations associated with
$2^5(a_1a_2)(b_1b_2)(c_1c_2)(d_2d_3)(e_1e_2)$. To do this one would need a
half-replication consisting of sixteen observations, eight of which have
already been made for column (1); the remaining eight are obtained from
column (1) by using all observations in this column which have 1 in the
fourth place. Change these 1's to 3's to give column (3). A similar
procedure gives the remaining columns.
TABLE I
Design for Half-Replication of $2^3 \times 3 \times 5$ Experiment


(1)      (2)      (3)      (4)      (5)      (6)
2^5      2^5      2^4      2^4      2^4      2^3
(a1a2)   (a1a2)   (a1a2)   (a1a2)   (a1a2)   (a1a2)
(b1b2)   (b1b2)   (b1b2)   (b1b2)   (b1b2)   (b1b2)
(c1c2)   (c1c2)   (c1c2)   (c1c2)   (c1c2)   (c1c2)
(d1d2)   (d1d2)   d3       d3       (d1d2)   d3
(e1e2)   (e3e4)   (e1e2)   (e3e4)   e5       e5
21111    21113    21131    21133    21115    21135
12111    12113    12131    12133    12115    12135
11211    11213    11231    11233    11215    11235
11121    11123                      11125
11112    11114    11132    11134
22211    22213    22231    22233    22215    22235
22121    22123                      22125
22112    22114    22132    22134
21221    21223                      21225
21212    21214    21232    21234
21122    21124
12212    12214    12232    12234
12221    12223                      12225
12122    12124
11222    11224
22222    22224


3. Least Squares Estimates of the Effects 


Least squares estimates of the effects will be found for the $2^4 \times 4$
experiment and in the development comments will be made bearing
on the results which would be obtained in five variable experiments in
which the factors are at other levels. Some extensions, as for example
experiments in which all factors are at an even number of levels, will
be obvious.

In the analysis, a function of many variables subject to many 
restrictions must be differentiated. The side restrictions will not be 
used until the differentiation has been carried out. If the solutions 
thus obtained satisfy the restrictions, they are the required solutions. 

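In what follows, it should be understood that wherever a summation
of observations is given over indicated subscripts, it means that
the subscripts assume the combinations of values which correspond to
observations in the design; it does not mean that the summation is over
all values of the subscripts, for this would be the case for a full
replication. The subscript 1/2 on the summation variables is a reminder of
this. It also follows that the ensuing discussion does not hold for
experiments in which all the factors are at an odd number of levels, for
in that case a 1/2 replication is impossible. It is still possible to set up
a design for such cases. However, finding the least squares estimates
would be a formidable task, and the analysis of variance for the
experiment would almost certainly not be elegant.

Let $x_{ijklm}$ represent an observation, where

$i = 1, 2, \cdots, I$
$j = 1, 2, \cdots, J$
$k = 1, 2, \cdots, K$
$l = 1, 2, \cdots, L$
$m = 1, 2, \cdots, M$

Let

$$Q = \sum_{(ijklm)_{1/2}} [x_{ijklm} - \mu - \alpha_i - \beta_j - \gamma_k - \delta_l - \epsilon_m - (\alpha\beta)_{ij} - (\alpha\gamma)_{ik} - (\alpha\delta)_{il} - (\alpha\epsilon)_{im} - (\beta\gamma)_{jk} - (\beta\delta)_{jl} - (\beta\epsilon)_{jm} - (\gamma\delta)_{kl} - (\gamma\epsilon)_{km} - (\delta\epsilon)_{lm}]^2$$

Setting $\partial Q/\partial\mu = 0$, we obtain

$$\sum_{(ijklm)_{1/2}} [x_{ijklm} - \hat\mu - \hat\alpha_i - \hat\beta_j - \hat\gamma_k - \hat\delta_l - \hat\epsilon_m - (\widehat{\alpha\beta})_{ij} - (\widehat{\alpha\gamma})_{ik} - (\widehat{\alpha\delta})_{il} - (\widehat{\alpha\epsilon})_{im} - (\widehat{\beta\gamma})_{jk} - (\widehat{\beta\delta})_{jl} - (\widehat{\beta\epsilon})_{jm} - (\widehat{\gamma\delta})_{kl} - (\widehat{\gamma\epsilon})_{km} - (\widehat{\delta\epsilon})_{lm}] = 0$$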
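Summing term by term,

$$\sum_{(ijklm)_{1/2}} \hat\mu = \tfrac{1}{2} IJKLM \hat\mu ;$$

$$\sum_{(ijklm)_{1/2}} \hat\alpha_i = \tfrac{1}{2} JKLM \hat\alpha_1 + \tfrac{1}{2} JKLM \hat\alpha_2 = 0 ;$$

similarly for $\hat\beta_j, \cdots, \hat\epsilon_m$;

$$\sum_{(ijklm)_{1/2}} (\widehat{\alpha\beta})_{ij} = \tfrac{1}{2} KLM (\widehat{\alpha\beta})_{11} + \tfrac{1}{2} KLM (\widehat{\alpha\beta})_{12} + \tfrac{1}{2} KLM (\widehat{\alpha\beta})_{21} + \tfrac{1}{2} KLM (\widehat{\alpha\beta})_{22} = 0 ;$$

similarly for all other interactions. This gives

$$\hat\mu = \frac{\sum_{(ijklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IJKLM}$$

Note, in the above, that the step $\sum_{(ijklm)_{1/2}} \hat\alpha_i = 0$ depended on
$\alpha_1$ and $\alpha_2$ being involved in the same number of observations. This
could not be the case for a factor at two levels in an experiment in which
one factor is at two levels, and all the others at an odd number of levels,
for in such a case, a one-half replication would have an odd number of
observations.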

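Setting $\partial Q/\partial\alpha_{i'} = 0$, we obtain

$$\sum_{(jklm)_{1/2}} [x_{i'jklm} - \hat\mu - \hat\alpha_{i'} - \hat\beta_j - \hat\gamma_k - \hat\delta_l - \hat\epsilon_m - (\widehat{\alpha\beta})_{i'j} - (\widehat{\alpha\gamma})_{i'k} - (\widehat{\alpha\delta})_{i'l} - (\widehat{\alpha\epsilon})_{i'm} - (\widehat{\beta\gamma})_{jk} - (\widehat{\beta\delta})_{jl} - (\widehat{\beta\epsilon})_{jm} - (\widehat{\gamma\delta})_{kl} - (\widehat{\gamma\epsilon})_{km} - (\widehat{\delta\epsilon})_{lm}] = 0$$

which leads to

$$\sum_{(jklm)_{1/2}} x_{i'jklm} - \tfrac{1}{2} JKLM \hat\mu - \tfrac{1}{2} JKLM \hat\alpha_{i'} = 0$$

From this,

$$\hat\alpha_{i'} = \frac{\sum_{(jklm)_{1/2}} x_{i'jklm}}{\tfrac{1}{2} JKLM} - \frac{\sum_{(ijklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IJKLM}$$

similarly for $\hat\beta_{j'}$, $\hat\gamma_{k'}$, $\hat\delta_{l'}$, $\hat\epsilon_{m'}$.
Obviously

$$\sum \hat\alpha_{i'} = \sum \hat\beta_{j'} = \sum \hat\gamma_{k'} = \sum \hat\delta_{l'} = \sum \hat\epsilon_{m'} = 0$$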
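Setting $\partial Q/\partial(\alpha\beta)_{i'j'}$ equal to zero, we obtain

$$\sum_{(klm)_{1/2}} [x_{i'j'klm} - \hat\mu - \hat\alpha_{i'} - \hat\beta_{j'} - \hat\gamma_k - \hat\delta_l - \hat\epsilon_m - (\widehat{\alpha\beta})_{i'j'} - (\widehat{\alpha\gamma})_{i'k} - (\widehat{\alpha\delta})_{i'l} - (\widehat{\alpha\epsilon})_{i'm} - (\widehat{\beta\gamma})_{j'k} - (\widehat{\beta\delta})_{j'l} - (\widehat{\beta\epsilon})_{j'm} - (\widehat{\gamma\delta})_{kl} - (\widehat{\gamma\epsilon})_{km} - (\widehat{\delta\epsilon})_{lm}] = 0$$

From this it follows that

$$\sum_{(klm)_{1/2}} x_{i'j'klm} - \tfrac{1}{2} KLM \hat\mu - \tfrac{1}{2} KLM \hat\alpha_{i'} - \tfrac{1}{2} KLM \hat\beta_{j'} - \tfrac{1}{2} KLM (\widehat{\alpha\beta})_{i'j'} = 0$$

and

$$(\widehat{\alpha\beta})_{i'j'} = \frac{\sum_{(klm)_{1/2}} x_{i'j'klm}}{\tfrac{1}{2} KLM} + \frac{\sum_{(ijklm)_{1/2}} x_{ijklm}}{\tfrac{1}{2} IJKLM} - \frac{\sum_{(jklm)_{1/2}} x_{i'jklm}}{\tfrac{1}{2} JKLM} - \frac{\sum_{(iklm)_{1/2}} x_{ij'klm}}{\tfrac{1}{2} IKLM}$$

Analogous results are obtained for the other interactions, and it can also
be verified that the restrictions on the interactions are satisfied. Note
that the estimates are exactly what one would expect them to be on an
intuitive basis.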
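
The mean-based form of these estimates can be checked numerically. The sketch below is illustrative only: the effect values are hypothetical, the half-replicate is generated by the parity rule assumed to describe design (iv), and the data are error-free so that the estimates should return the effects put in.

```python
from itertools import product

LEVELS = (2, 2, 2, 2, 4)  # the 2^4 x 4 experiment

# Half-replicate: odd number of factors at an even-numbered level (assumed rule).
runs = [r for r in product(*(range(1, n + 1) for n in LEVELS))
        if sum(x % 2 == 0 for x in r) % 2 == 1]

# Hypothetical true effects satisfying the sum-to-zero restrictions.
mu = 10.0
alpha = {1: -1.5, 2: 1.5}
eps = {1: -2.0, 2: 0.5, 3: 1.0, 4: 0.5}
sign = {1: -1.0, 2: 1.0}
ab = 0.75    # (alpha beta)_ij = ab * sign[i] * sign[j]
cd = -0.4    # (gamma delta)_kl = cd * sign[k] * sign[l]

def response(run):
    i, j, k, l, m = run
    return (mu + alpha[i] + eps[m]
            + ab * sign[i] * sign[j] + cd * sign[k] * sign[l])

x = {r: response(r) for r in runs}
POS = {"i": 0, "j": 1, "k": 2, "l": 3, "m": 4}

def mean_where(**fixed):
    """Mean of x over the design runs whose named subscripts take the given levels."""
    vals = [v for r, v in x.items()
            if all(r[POS[name]] == lvl for name, lvl in fixed.items())]
    return sum(vals) / len(vals)

mu_hat = mean_where()
alpha_hat = {i: mean_where(i=i) - mu_hat for i in (1, 2)}
ab_hat = {(i, j): mean_where(i=i, j=j) + mu_hat - mean_where(i=i) - mean_where(j=j)
          for i in (1, 2) for j in (1, 2)}
print(mu_hat, alpha_hat[2], ab_hat[(1, 2)])
```

Each estimate is a difference of means over the observations actually in the design, which is the intuitive form noted above.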

To investigate the orthogonality of the estimates, we form Table II
which gives the least squares estimates for the $2^4 \times 4$ half-replication;
the constant multiplier, 1/32, should be associated with each term. It is
clear that the least squares estimates of the effects for any experiment in
which all factors are at an even number of levels are given by formulas
analogous to those on the previous pages. It is not obvious, however,
when some of the factors are at odd levels, and it has already been shown
that if too many factors are at an odd number of levels, the formulas
do not hold at all.


4. The Analysis of Variance 


The elegance of the analysis of variance for the designs will vary
considerably. Although no full investigation has been carried out,
it appears that the more factors there are at an even number of levels,
the neater the analysis. This will be demonstrated by a comparison
of the $2^4 \times 4$ and $2^4 \times 3$ half-replications.
For the $2^4 \times 4$ half-replication, all main effects and two-factor
interactions are orthogonal. Hence, the breakdown of the degrees of
freedom is straightforward and the degrees of freedom for all main
effects and two factor interactions can be isolated. The breakdown
of the degrees of freedom is as follows:


Effect      d.f.

A            1
B            1
C            1
D            1
E            3
AB           1
AC           1
AD           1
BC           1
BD           1
CD           1
AE           3
BE           3
CE           3
DE           3

Residual     6

Total       31

In Table III estimates of effects for a $2^4 \times 3$ experiment have been
specified. Note that the following two-factor interactions are partially
confounded.


AB and CD
AC and BD
AD and BC


All other main effects and two factor interactions are orthogonal. On
further examination, however, it can be seen that confounded
interactions are not linearly dependent, that is, they are only partially
confounded, and it will be possible to extract the sum of squares
associated with the six degrees of freedom carried by AB, CD, AC, BD,
AD and BC. It is noteworthy that this permits a residual mean square
free of two factor interaction, though carrying very few degrees of
freedom.
The breakdown of the degrees of freedom would be as follows:


Effect                          d.f.

A                                1
B                                1
C                                1
D                                1
E                                2
AB, CD, AC, BD, AD, and BC       6
AE                               2
BE                               2
CE                               2
DE                               2

Residual                         3

Total                           23

Of course, if it is known that three or more of the six interactions 
involving the two-level factors are zero, the single degrees of freedom 
for the remaining interactions can be listed separately. 

It should be mentioned that when the method of forming the designs 
leads to an analysis of variance so complicated as to be prohibitive, it 
is always possible to carry out analyses on convenient portions of the 
data which may be appropriate to the problem at hand. It is hoped 
that the situations which admit convenient analyses of variance for the 
type of design described, can be studied and discussed in the future. 
At present, it can only be said that it appears that the more factors 
there are at an even number of levels, the easier the analysis of variance. 


5. The Estimates of the Missing Observations. 


In this section, the expression "missing observation" is used with a
somewhat different meaning than is customary. Usually a missing
observation is one which was planned for but not obtained. In what
follows, however, it refers to an observation which was purposely not
obtained.

Estimates of the missing observations are found by simply combining
the estimates of the effects with appropriate coefficients. For example,
the estimate of 11122 would be $\hat\mu - A - B - C + D + E_2 + AB + AC - AD + BC - BD - CD - AE_2 - BE_2 - CE_2 + DE_2$.
By the nature of the mathematical model and in the absence of
three-factor interactions, the estimates are, of course, unbiased in the
$2^4 \times 4$ experiment. In the $2^4 \times 3$ experiment, due to the partial
confounding, estimates of the missing observations obtained in the
traditional manner are biased. In fact, it can be easily shown that in this
design the bias in AB is -CD/3, in AC, -BD/3, and so on for the
other interactions involved in the partial confounding. Unbiased
estimates of the missing observations can be formed but these will not
be discussed here.

It will be interesting to investigate the precision of the estimates.
First observe that if the variance of an observation is $\sigma^2$, the variance of
an estimated effect is in general $\sigma^2/n$, for the estimate of an effect is
equal to 1/n multiplied by a sum of n independent variates. It can be
seen in Table I that in the $2^4 \times 4$ half replicate the estimates of effects
are orthogonal and so the variance of an estimate of a missing observation is
easily found. It would be the sum of the variances of the effects, which
for the $2^4 \times 4$ experiment would be $16(\sigma^2/32) = \sigma^2/2$.

When the estimates are correlated, as in the $2^4 \times 3$ experiment,
it is a little more complicated to find the variance of an estimate of a
missing observation. However, a glance at Table I shows that if the
observations are divided into three groups on the basis of the last
subscript, and if the portions of the estimates associated with these groups
are designated $y_1$, $y_2$, and $y_3$, then two two-factor interactions which
are confounded will differ in $y_1$, $y_2$ and $y_3$ by only two signs. For
example, if we write


AB = $y_1 + y_2 + y_3$,

then

CD = $-y_1 + y_2 - y_3$,

where


$y_1$ =        $y_2$ =        $y_3$ =

-(21111)     +(11112)     -(21113)
-(12111)     +(22112)     -(12113)
+(11211)     -(21212)     +(11213)
+(11121)     -(21122)     +(11123)
+(22211)     -(12212)     +(22213)
+(22121)     -(12122)     +(22123)
-(21221)     +(11222)     -(21223)
-(12221)     +(22222)     -(12223)


TABLE IV

Estimates of Missing Observations in the $2^4 \times 3$ Experiment

To estimate a missing observation, find the subscripts of the observation
in the top row and associate the signs in the column under the subscripts
with the effects in the column on the extreme left.

Effect | 11111 22111 21211 21121 21112 12211 12121 12112 11221 11212 11122 22221 22212 22122 21222 12222 11113 22113 21213 21123 ...

[The remaining column headings and the body of signs are not legible in this copy.]
We shall need cov (AB, CD):

cov (AB, CD) = cov $[-y_1^2 + y_2^2 - y_3^2 - 2y_1y_3]$
             $= -\sigma^2(y_1) + \sigma^2(y_2) - \sigma^2(y_3)$


$\sigma^2(y_1) = \sigma^2(y_2) = \sigma^2(y_3) = 8\sigma^2/(24)^2 = \sigma^2/72$


Hence cov (AB, CD) = $-\sigma^2/72$.
Similarly cov (AC, BD) = cov (AD, BC) = $-\sigma^2/72$.
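
This covariance can be checked by direct counting. The sketch assumes the half-replicate is given by the parity rule that reproduces design (iii), that each interaction estimate is 1/24 times a signed sum of the 24 independent observations, and that a contrast sign is the product of the factor signs (plus at level 2, minus at level 1). Results are in units of $\sigma^2$.

```python
from fractions import Fraction
from itertools import product

# Half-replicate of the 2^4 x 3 experiment (assumed parity rule).
runs = [r for r in product((1, 2), (1, 2), (1, 2), (1, 2), (1, 2, 3))
        if sum(x % 2 == 0 for x in r) % 2 == 1]
n = len(runs)  # 24 observations

def contrast(run, f, g):
    """Sign attached to this observation in the f-by-g interaction contrast."""
    s = lambda lvl: 1 if lvl == 2 else -1
    return s(run[f]) * s(run[g])

def cov_over_sigma2(f1, g1, f2, g2):
    """Covariance of two (1/n)-scaled contrasts, in units of sigma^2."""
    total = sum(contrast(r, f1, g1) * contrast(r, f2, g2) for r in runs)
    return Fraction(total, n * n)

print(cov_over_sigma2(0, 1, 2, 3))  # cov(AB, CD) -> -1/72
print(cov_over_sigma2(0, 1, 0, 1))  # var(AB)     -> 1/24
```

The signs agree on the eight observations at $e_2$ and disagree on the sixteen at $e_1$ and $e_3$, so the sum of products is -8 and the covariance is $-8\sigma^2/576$.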

Table IV gives all the estimates of the missing observations. 

The variance of an estimate is equal to the sum of the variances 
of the effects plus the sum of the covariances of the effects with the 
proper sign prefixed. It will follow then that the estimates will have 
different variances depending on how the signs of the terms which make 
up AB and CD, AC and BD, AD and BC agree or disagree. Table V 
gives the variances of the estimates of all possible types of missing 
observations. 

It can be seen that although the variances differ, they differ by 
very little, and this is likely to be the case for higher order factorials 
and when the factors are at a greater number of levels. 


TABLE V


Variances of the Estimates of all Possible Types of Missing Observations
($2^4 \times 3$ Half-Replication)


Condition of signs within the sets AB and CD, AC and BD, AD and BC;
example missing observation from Table IV; variance:

All three sets agree in sign. Example: [illegible].
Variance: $16(\sigma^2/24) - 3(\sigma^2/36) = 7\sigma^2/12$

All three sets disagree. Example: 11212.
Variance: $16(\sigma^2/24) + 3(\sigma^2/36) = 3\sigma^2/4$

Two of the sets disagree, and one set agrees. Example: 22111.
Variance: $16(\sigma^2/24) - (\sigma^2/36) + 2(\sigma^2/36) = 25\sigma^2/36$

One of the sets disagrees, and two agree. Example: 12112.
Variance: $16(\sigma^2/24) + (\sigma^2/36) - 2(\sigma^2/36) = 23\sigma^2/36$
TABLE VI


Experimental Data and Estimates for $2^4 \times 3$ Experiment


           Dependent Variable                   Dependent Variable
Condition  Observed     Estimated    Condition  Observed     Estimated
11113      5.30         4.52         21113      6.05
11123      6.65                      21123      [illegible]  7.96
11112      4.60                      21112      3.95         4.10
11122      [illegible]  5.68         21122      4.95
11111      4.90         4.70         21111      1.60
11121      6.05                      21121      1.80         3.38
11213      20.15                     21213      18.85        19.00
11223      25.15        26.36        21223      25.00
11212      19.80        [illegible]  21212      16.80
11222      26.70                     21222      [illegible]  23.14
11211      19.20                     21211      15.20        14.80
11221      [illegible]  24.40        21221      20.45
12113      5.85                      22113      6.00         5.78
12123      [illegible]  8.10         22123      7.80
12112      4.70         5.86         22112      3.60
12122      5.95                      22122      4.45         3.44
12111      4.40                      22111      0.60         -.12
12121      5.85         4.44         22121      0.40
12213      17.30        17.10        22213      14.35
12223      23.45                     22223      18.70        20.46
12212      14.70                     22212      11.25        9.60
12222      20.65        20.30        22222      14.95
12211      12.50        13.82        22211      8.05
12221      [illegible]               22221      10.05        12.42

6. Example


Experimental Towing Tank data were available in which five
independent variables, termed A, B, C, D and E, were involved. A, B,
C, and D were each at two levels and E at three levels. For this
experiment all possible 48 data points had been obtained experimentally,
and the experimental error had been determined from 30 repeat runs
to equal .19 (i.e. $\hat\sigma^2 = .19$). It was decided to set up an estimation design
for this experiment and to compare the results of the half-replication
with that of the full-replication. The design for the $2^4 \times 3$ experiment
has already been formed. Table VI shows the full set of data points
TABLE VIIa
Analysis of Variance, Full Replication $2^4 \times 3$ Experiment


Source        S.S.       d.f.    M.S.       F
A             97.61       1      97.61      212.2*
B             124.00      1      124.00     269.5*
C             2214.76     1      2214.76    4814.8*
D             131.50      1      131.50     285.8*
E             129.06      2      64.53      140.2*
AB            3.23        1      3.23       7.0*
AC            20.09       1      20.09      43.6*
AD            4.24        1      4.24       9.2*
BC            110.87      1      110.87     241.0*
BD            2.50        1      2.50       5.4*
CD            58.20       1      58.20      126.5*
AE            25.58       2      12.79      27.8*
BE            10.80       2      5.40       11.7*
CE            3.05        2      1.52       3.3
DE            2.80        2      1.40       3.0

Residual      12.45      27      .46

Total S.S.    2950.74    47      62.78

*Indicates significance at the 5% level.

TABLE VIIb


Analysis of Variance, Half-Replication $2^4 \times 3$ Experiment


Source          S.S.       d.f.    M.S.       F
A               41.21       1      41.21      50.3*
B               56.89       1      56.89      69.4*
C               1115.89     1      1115.89    1361.4*
D               69.19       1      69.19      84.4*
E               61.77       2      30.88      37.7*
AE              12.50       2      6.25       7.6*
BE              7.06        2      3.53       4.3*
CE              0.86        2      0.43       0.5
DE              1.73        2      0.86       1.0

AB, AC, AD,
BC, BD, CD      93.46       6      15.58      19.0*

Residual S.S.   2.46        3      .82

Total S.S.      1463.03    23      63.61


*Indicates significance at the 5% level.
and points estimated under the design. Comparisons of observed and
estimated points should be made in the light of $\hat\sigma = .45$, and an error of
estimation of approximately $\sqrt{2/3}\,\hat\sigma = .37$. Also, since some of the
effects are partially confounded, each estimate has a bias. The standard
deviation of the difference between an observation and an estimate is
.58. For the 24 points estimated, the maximum discrepancy is equal
to about 4 standard deviations (of a difference).
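
The two quoted standard deviations can be reproduced with a short calculation. The sketch assumes that the variance of an estimate is taken as roughly 16 effect terms each of variance $\sigma^2/24$, and that an estimate is independent of the observation it is compared with (the observation plays no part in the half-replicate).

```python
import math

sigma_hat = 0.45                    # standard deviation of one observation (from the text)
var_ratio = 16 / 24                 # assumed: 16 terms, each of variance sigma^2 / 24
sd_estimate = math.sqrt(var_ratio) * sigma_hat
sd_difference = math.sqrt(1 + var_ratio) * sigma_hat
print(round(sd_estimate, 2), round(sd_difference, 2))  # 0.37 0.58
```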

It may be noted (Tables VIIa, VIIb) that at the 5% level of
significance the results of the half- and full-replicate agree. At the 1%
level, the results on the BE interaction would differ. In general, of
course, a residual sum of squares carrying only three degrees of
freedom is not a good denominator for the F-ratio. In much engineering
work, the only estimate of error which is trusted is determined from
repeat runs and, if possible, it is best to use an estimate thus obtained
rather than the residual. If the error estimate obtained from repeat
runs ($\hat\sigma^2 = .19$) had been used in the two analyses of variance, only
the CE interaction in the analysis for the half-replication would not
have shown as significant at the 5% level. It is also apparent that
in both analyses the ratios of the residual mean squares to $\hat\sigma^2 = .19$
indicate the presence of some three-factor interactions.
BLOCK EFFECTS IN THE DETERMINATION OF
OPTIMUM CONDITIONS


Robert M. DeBaun
The second-order "central composite" designs proposed by Box and
Wilson (1951) for the determination of optimum levels of a set of
continuously variable factors consist of a factorial design in full or
suitable partial replication in the n factors under study plus a cross
polytope of radius $\alpha$ with 2n points on the exterior of the design
(radius = $\alpha$) and g replications of the central point. In practice, these
designs tend to arise as follows: The experimenter assigns levels of the
true experimental variables to the coded levels of the design variables
$(0, \pm 1, \pm\alpha)$, and then conducts a factorial experiment (in full or
partial replication) at levels $X_i = \pm 1$ $(i = 1, 2, \cdots, n)$. This design
is conducted in a sufficiently large partial replication of the full factorial
so that a first-order response surface (I) can be fitted to the yield values
at the $X_i = \pm 1$ points in the factor space.


(I)  $Y = B_0 + B_1X_1 + B_2X_2 + \cdots + B_iX_i + B_jX_j + \cdots + B_nX_n$


Sufficient points are usually included in the factor space so that at
least some second-order terms (in $B_{ij}X_iX_j$) can also be estimated. If
the magnitude of the second-order coefficients is such that the
first-order approximation will be insufficient to describe the response surface
accurately, the points are added at the center $0, 0, 0, \cdots, 0$ and the
cross-polytope is added with points at $\pm\alpha, 0, 0, \cdots, 0$; $0, \pm\alpha, 0, \cdots, 0$;
$\cdots$; $0, 0, 0, \cdots, \pm\alpha$. The inclusion of the additional points
permits the fitting of a second-order response surface to the yield data
(II), from which the response can frequently be characterized adequately.


(II)  $Y = B_0 + \sum_{i=1}^{n} B_iX_i + \sum_{i=1}^{n} B_{ii}X_i^2 + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} B_{ij}X_iX_j$
The following considerations have been found useful in our
application of these experimental designs. If in the preliminary block of
experiments, i.e. the "factorial" stage, one or more replications are
added at the point $0, 0, \cdots, 0$ $(X_i = 0)$, then the mean yield at the
peripheral points $(X_i = \pm 1)$ less that at the central point estimates
the sum of the quadratic coefficients ($B_{ii}$). If this value appears large
relative to the absolute values of the $B_i$ estimated, the response surface
is definitely revealed as curved, and a good estimate of it will require
the fitting of the full second-order model. If not, it is often profitable
to move along a line of "steepest" ascent in the factor space, before
attempting to fit the full second-order model.
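
A small numerical check of this curvature estimate follows. The surface and all of its coefficients are hypothetical, and a full $2^3$ factorial with error-free yields is assumed so that the identity holds exactly.

```python
from itertools import product

# Hypothetical second-order surface in n = 3 coded variables.
B0 = 50.0
B = [2.0, -1.0, 0.5]                          # first-order coefficients B_i
Bii = [-3.0, -1.5, 0.25]                      # pure quadratic coefficients B_ii
Bij = {(0, 1): 0.8, (0, 2): -0.6, (1, 2): 0.3}

def yield_at(x):
    y = B0
    y += sum(b * xi for b, xi in zip(B, x))
    y += sum(b * xi * xi for b, xi in zip(Bii, x))
    y += sum(b * x[i] * x[j] for (i, j), b in Bij.items())
    return y

corners = list(product((-1, 1), repeat=3))    # peripheral points X_i = +/-1
mean_corners = sum(yield_at(x) for x in corners) / len(corners)
curvature = mean_corners - yield_at((0, 0, 0))
print(curvature, sum(Bii))
```

The first-order and cross-product terms average to zero over the factorial points, so the difference equals the sum of the $B_{ii}$.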

In adding the cross-polytope, it may be necessary to consider
"block" effects. Depending on the level of the design variables selected
for $\alpha$, the other coefficients, in particular the important $B_{ii}$, may be
biased to some extent. The bias is determined both by $\alpha$ and by the
amount of shift in $B_0$. However, if the block effect is such as to alter
only $B_0$, the nature of the response surface can still be determined
without bias due to a block effect.

For example, in a five-factor experiment, the first block might
consist of the half-replicate of the $2^5$ factorial design (I = ABCDE),
plus four replications at $X_i = 0$. This block provides one degree of
freedom for $B_0$, fifteen for the $B_i$ and the $B_{ij}$, one for the summed $B_{ii}$
and three for error. If the $B_{ij}$ and the summed $B_{ii}$ appear small with
respect to the $B_i$, the approach along the line of "steepest ascent" is
indicated. If it is desired to fit the full second-order model by adding
the cross-polytope, the possibility of the block effect needs to be
considered. The expected value for the mean from the first block is
$B_0 + \frac{4}{5} \sum B_{ii}$. The block contrast will be orthogonal to the
second-order response surface if the expected value of the mean for the block
containing the cross-polytope is the same. The expected value for this
mean is $B_0 + (2\alpha^2/n_c) \sum B_{ii}$, where $\alpha$ is the radius of the cross-polytope
and $n_c$ is the total number of points it contains. Such a block would be
the ten points $\pm 2, 0, 0, 0, 0$; $\cdots$; $0, 0, 0, 0, \pm 2$. The block contrast
will also be orthogonal to the third-order response terms, although these
will not be free of the $B_i$ and $B_{ij}$.
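
The matching of the two block means can be verified with exact fractions. The helper names below are hypothetical; each returns the coefficient multiplying the summed quadratic coefficients in the expected block mean, counting one unit per factorial point and $2\alpha^2$ in total over the axial points.

```python
from fractions import Fraction

def factorial_block_coeff(n_factorial, n_center):
    """Coefficient of sum(B_ii) in the expected mean of a factorial block."""
    return Fraction(n_factorial, n_factorial + n_center)

def axial_block_coeff(alpha, n_factors, n_center=0):
    """Coefficient of sum(B_ii) in the expected mean of a cross-polytope block."""
    n_points = 2 * n_factors + n_center
    return Fraction(2 * alpha * alpha, n_points)

first = factorial_block_coeff(16, 4)   # half of the 2^5 plus four centre points
second = axial_block_coeff(2, 5)       # ten points at +/-2 on each axis
print(first, second)  # 4/5 4/5
```

Both coefficients equal 4/5, which is the condition for the block contrast to be orthogonal to the second-order surface in this example.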

If effects of order higher than two are suspected on completion
and analysis of the two blocks, a third block can be run, consisting of
the other half-replicate of the $2^5$ factorial design plus four more
replications at $X_i = 0$. The $B_{iii}$ will now be partially confounded with the
$B_i$, but the $B_{ijk}$ will be free of the second-order response surface and
of the block contrast.

In this example, the length of $\alpha$ is exactly that suitable for a
"rotatable" central composite design (Hunter, 1954). The inclusion of
additional points at the center ($X_i = 0$) also enhances the design, as the
location of the maximum is made more precise by reducing correlation
among the estimates of the $B_{ii}$ (Box and Hunter, 1954).

It can be seen that the inclusion of central points in the various 
blocks that might arise enables the experimenter to balance block 
effects, rotatability and uniform information, subject only to his in- 
genuity in blocking the ‘factorial’ portion of the design. 

A similar, independent treatment of the blocking problem in central 
composite designs is described by Box and Hunter (1955). 
ADJUSTMENT BY COVARIANCE AND CONSEQUENT TESTS
OF SIGNIFICANCE IN SPLIT-PLOT EXPERIMENTS*


Jeanne Titus Truitt** and H. Fairfield Smith


North Carolina State College 


1. INTRODUCTION 


This paper investigates methods for adjusting experimental data by 
covariance on a concomitant variable, and of appropriate ensuing tests 
of significance, when, as in a split-plot experiment, the analysis of 
covariance has two or more rows for errors at different levels, each 
yielding regressions of the dependent variate on the concomitant 
variable which may be assumed to be equal. Somewhat to our surprise, 
when seeking examples, six out of nine split-plot field experiments 
examined showed significant differences between the two regressions. 
This suggests that the assumption may not be so generally valid as 
might be expected, but the sole concern of this paper is with what to 
do when the assumption is permissible. 

Anderson (1946), when discussing how to deal with a missing sub- 
plot, suggested the procedure described in sec. 3 below, but warned that 
the two mean squares so evaluated would not be independent and the 
F-test therefore not valid. Stimulus for the work here reported derived 
from finding that method being promulgated without Anderson’s 
warning as a general procedure for covariance adjustment in split-plot 
experiments; whence it seemed worth while to endeavour to work out 
just what the procedure implied and the distributions involved. 

Bartlett (1937) stated a different procedure (sec. 4 below) without 
discussion or proof of its validity. Being given in a single sentence in 
a paper dealing with many other matters his statement seems to have 
been generally overlooked and was unknown to us until after we had 
derived his test independently on the grounds discussed below. 

Assume given a split-plot experiment with m main-plot treatments 
in r randomized blocks and with q split-plot treatments in each main- 
plot. Let y be the observed experimental variate, and x be a concomitant 
variable unaffected by treatments and measured by deviations from 
its mean over the whole experiment. Assuming the regression of y on 
x to be the same both between main-plots within blocks and between 


*Sponsored in part by the Office of Ordnance Research, United States Army, 
under contract DA-36-034-ORD-1517. 
**Now at Dayton University. 


24 BIOMETRICS, MARCH 1956 


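split-plots within main-plots the usual linear (regression) model is 

$$y_{ijk} = \mu + \alpha_i + \rho_j + \delta_{ij} + \gamma_k + (\alpha\gamma)_{ik} + \beta x_{ijk} + \epsilon_{ijk} \qquad (1)$$

$i = 1, \ldots, m$; $j = 1, \ldots, r$; $k = 1, \ldots, q$; 

$\delta_{ij}$ = main-plot error NID$(0, \sigma_\delta^2)$, 

$\epsilon_{ijk}$ = split-plot error NID$(0, \sigma_\epsilon^2)$, 

$\sum_i \alpha_i = \sum_j \rho_j = \sum_k \gamma_k = \sum_i (\alpha\gamma)_{ik} = \sum_k (\alpha\gamma)_{ik} = 0$. 

The $x_{ijk}$ are usually regarded as a given set of constants, as in ordinary 
regression analysis, but this viewpoint will be later modified. 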


TABLE 1. 
Notation for Sums of Squares and Products 

Source            d.f.                            $y^2$       $xy$        $x^2$ 
Replications      $(r - 1)$                       -           -           - 
Main treatments   $\nu_M = (m - 1)$               $M_{yy}$    $M_{xy}$    $M_{xx}$ 
Error D           $\nu_D = (m - 1)(r - 1)$        $D_{yy}$    $D_{xy}$    $D_{xx} = D$ 
Sub-treatments    $(q - 1)$                       $S_{yy}$    $S_{xy}$    $S_{xx}$ 
Interaction       $(m - 1)(q - 1)$                $SM_{yy}$   $SM_{xy}$   $SM_{xx}$ 
Error E           $\nu_E = m(q - 1)(r - 1)$       $E_{yy}$    $E_{xy}$    $E_{xx} = E$ 


Table 1 shows the notation which will be used to designate sums of 
squares and products in the analysis of variance and covariance. For 
ease of printing subscripts will be omitted from the three symbols to be 
most frequently used; wherever M, D and E occur without other 
indication subscripts xx are to be understood; when they occur with a 
superscript, subscripts yy are to be understood. Each row yields an 
estimate of a regression coefficient and the sum of squares of deviations 
about the respective regression. For example $b_M = M_{xy}/M_{xx}$; and 
$M^* = M_{yy} - b_M M_{xy} = rq \sum_i (y_{i..} - y_{...} - b_M x_{i..})^2$ = the sum of 
squares of deviations of the main treatment means, $y_{i..}$, about that 
regression. Similarly $b_D = D_{xy}/D_{xx}$; and 
$D^* = D_{yy} - b_D D_{xy} = q \sum_i \sum_j [y_{ij.} - y_{i..} - y_{.j.} + y_{...} - b_D (x_{ij.} - x_{i..} - x_{.j.})]^2$ = the 
sum of squares of deviations of main-plot means from respective 
members of a set of parallel regressions through a compound of treat- 
ment and block means ($x_{...}$ being zero by definition of x). 

Treatment effects are defined as differences between mean yields of 
the observed treatment combinations. Main plots, being balanced 
for sub-plots, form simple randomized blocks within which $b_D$ provides 
an estimate of $\beta$. Significance of main-treatment effects adjusted for 
x can be tested in the usual way with the “reduced” treatment sum 
of squares $(M + D)^* - D^*$. 
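
The quantities just defined can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the inputs are the rounded sums of squares and products printed for example 1 in section 7, so the results agree with the tabulated mean squares only approximately.

```python
def slope(r_xy, r_xx):
    """Regression coefficient b_R = R_xy / R_xx for one row of table 1."""
    return r_xy / r_xx


def residual_ss(r_yy, r_xy, r_xx):
    """R* = R_yy - b_R * R_xy: sum of squares about the row's own regression."""
    return r_yy - r_xy ** 2 / r_xx


def reduced_treatment_ss(treat, error):
    """(T + Err)* - Err*: treatment sum of squares adjusted by the error-row slope.

    Each argument is a tuple (yy, xy, xx) of sums of squares and products.
    """
    pooled = tuple(t + e for t, e in zip(treat, error))
    return residual_ss(*pooled) - residual_ss(*error)


# Illustrative inputs: (yy, xy, xx) as printed for example 1 (rounded values).
M = (32.210, -29.068, 31.20)    # main treatments, 5 d.f.
D = (97.807, -78.005, 128.49)   # main-plot error, 10 d.f.

b_D = slope(D[1], D[2])
error_ms = residual_ss(*D) / (10 - 1)          # one d.f. is lost to the regression
treatment_ms = reduced_treatment_ss(M, D) / 5
F = treatment_ms / error_ms
```

The error mean square carries nine degrees of freedom because one is used in estimating the main-plot regression.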

The main plots form mr blocks of split-plots but with the addition 
over simple randomized blocks that m groups, of r “blocks” each, are 
“identifiable” (Smith, 1955b) by main treatments to yield an estimate 
of the main × sub-treatments interaction. Since we consider only the 
factorial model row E gives the estimate of error for sub-treatment 
effects, as well as for interaction, and another estimate of $\beta$. Reduced 
treatment and interaction sums of squares follow in the usual way for 
tests of significance. 

These tests are well known and unambiguously valid. The reasons 
for considering alternative procedures are: (1) given the postulate that 
regression is homogeneous at both levels each part of the analysis uses 
only part of the available information on $\beta$. Therefore greater accuracy 
should be possible from combining both parts to obtain a single estimate 
of the coefficient. And (2) results adjusted by different coefficients 
cannot be presented in a two way table for crossed treatments with 
self-consistent means. Although not important this is unpleasing and 
complicates neat presentation of results. 


2. MAXIMUM LIKELIHOOD SOLUTION AND OTHER WEIGHTED 
MEANS OF MAIN- AND SUB-PLOT REGRESSIONS 

The maximum likelihood solution for $\beta$ (Cochran, 1946) is the 
weighted mean of estimates from each error row: 

$$\hat\beta = \frac{b_D D/\sigma_1^2 + b_E E/\sigma_2^2}{D/\sigma_1^2 + E/\sigma_2^2} \qquad (2)$$

where $\sigma_1^2 = \sigma_\epsilon^2 + q\sigma_\delta^2$ = the theoretical error variance for main plots, 
$\sigma_2^2 = \sigma_\epsilon^2$ = the error variance of sub-plots. In the complete maximum 
likelihood solution the estimates $\hat\sigma_1^2$, $\hat\sigma_2^2$ depend on $\hat\beta$; they can only be 
obtained by iteration and resulting distribution theory is difficult. 
However almost all the information on these variances is contained in 
$s_1^2 = D^*/(\nu_D - 1)$ and $s_2^2 = E^*/(\nu_E - 1)$; the full maximum likelihood 
solution being equivalent to salvaging only a small fraction of a degree 
of freedom for each. We may therefore be content to accept $s_1^2$, $s_2^2$ as 
estimates of $\sigma_1^2$, $\sigma_2^2$; but exact sampling distributions will still be 
excessively difficult to determine if they, random variables, be substituted 
in (2). Little information may be lost, and simplification gained, with 




an arbitrarily weighted mean, say 

$$b_c = c\,b_D + (1 - c)\,b_E \qquad (3)$$

If $c = c_\omega = D/(D + \omega E)$, where $\omega = \sigma_1^2/\sigma_2^2$, this becomes the maximum 
likelihood solution with minimum variance for known weights. If $\omega$ 
be replaced by an arbitrary $w$, perhaps guessed from previous experience, 
the variance of $b_c$ is less than the lesser of var($b_D$) and var($b_E$) provided 

$0 < c < 2c_\omega$, or equivalently $\infty > w > (\omega E - D)/2E > 0$, 
or $(2c_\omega - 1) < c < 1$, or equivalently $2\omega D/(D - \omega E) > w > 0$. 
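
A small numerical sketch may make these bounds concrete. It assumes known error variances and uses hypothetical values throughout; the function names are illustrative only. It checks that the weight $D/(D + \omega E)$ gives the smallest variance, that a moderately wrong guess $w$ still beats either single coefficient, and that at $c = 2c_\omega$ the weighted mean is exactly as variable as $b_E$ alone.

```python
def var_weighted(c, D, E, var1, var2):
    """Variance of c*b_D + (1 - c)*b_E, with var(b_D) = var1/D and var(b_E) = var2/E."""
    return c ** 2 * var1 / D + (1 - c) ** 2 * var2 / E


def weight_for_ratio(D, E, ratio):
    """Weight c = D / (D + ratio*E); optimal when ratio equals var1 / var2."""
    return D / (D + ratio * E)


# Hypothetical values chosen only to exercise the formulas.
D, E = 128.49, 279.33          # error sums of squares of x
var2 = 1.0                     # sub-plot error variance
omega = 4.0                    # true ratio var1 / var2
var1 = omega * var2

c_opt = weight_for_ratio(D, E, omega)
v_opt = var_weighted(c_opt, D, E, var1, var2)
v_D, v_E = var1 / D, var2 / E

# A guessed ratio w inside the stated bounds still improves on b_D and b_E.
w_guess = 2.0
v_guess = var_weighted(weight_for_ratio(D, E, w_guess), D, E, var1, var2)

# At the boundary c = 2*c_opt the weighted mean is no better than b_E alone.
v_boundary = var_weighted(2 * c_opt, D, E, var1, var2)
```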


Let R stand for the capital letter in any single row of table 1. If 
we form an adjusted sum of squares with any arbitrary or estimated 
coefficient $b'$ 

$$R' = R_{yy} - 2b'R_{xy} + b'^2 R_{xx} \qquad (4)$$
$$= R^* + R_{xx}(b_R - b')^2 \qquad (5)$$

Under the null hypothesis of no treatment effects 

$$\mathcal{E}(R^*)/(\nu - 1) = R_{xx}\,\mathcal{E}(b_R - \beta)^2 = \sigma_i^2$$

where $\nu$ is the degrees of freedom in row R, and $i$ = 1 or 2 according as 
R is in the upper or lower part of table 1. Also $R^*$ and $b_R$ are inde- 
pendent. Therefore if $b'$ with expectation $\beta$ be also normally distributed 
independently of $R^*$, then $R'$ is distributed as the weighted sum of 
two independent $\chi^2$, namely 

$$\chi^2_{\nu-1}\sigma_i^2 + \chi^2_1 R_{xx}\operatorname{var}(b_R - b') \qquad (6)$$
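
The algebraic identity between forms (4) and (5) of the adjusted sum of squares is easy to check numerically. The sketch below uses a hypothetical row of sums of squares and products and a few arbitrary trial slopes; the function names are illustrative and not part of the original paper.

```python
def adjusted_ss(r_yy, r_xy, r_xx, b):
    """Form (4): row sum of squares after adjusting y by an arbitrary slope b."""
    return r_yy - 2 * b * r_xy + b ** 2 * r_xx


def adjusted_ss_split(r_yy, r_xy, r_xx, b):
    """Form (5): the same quantity written as R* plus a single-d.f. term."""
    b_row = r_xy / r_xx
    r_star = r_yy - b_row * r_xy
    return r_star + r_xx * (b_row - b) ** 2


# Hypothetical row (yy, xy, xx) and trial slopes, for illustration only.
row = (97.807, -78.005, 128.49)
for b in (-0.6868, 0.0, 1.0):
    assert abs(adjusted_ss(*row, b) - adjusted_ss_split(*row, b)) < 1e-9

# Putting b equal to the row's own slope recovers R* itself.
own_slope = row[1] / row[2]
r_star = adjusted_ss(*row, own_slope)
```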


If R represents a treatment row, say main-treatments, and if treatments 
have an effect, 
eds) = dg airs 4g 

The first term on the right is a treatment contrast which we want to 
retain in the treatment sum of squares. The purpose of adjusting is to 
estimate it ‘decontaminated’ from $\beta$. We would like to do this in such 
a way that the total adjusted sum of squares will have a manageable 
distribution and be suitable for testing against an independent error 
mean square from another row or rows of the analysis. Since the 
distribution of a sum of $\chi^2$’s with unequal weights is intractable the only 
simple solution is to make $M'$ proportional to a homogeneous $\chi^2$. 
Since (5), when R = M, inevitably contains $M^*$, (6) shows that that 
can be done only if we can find a $b'$ and weight $k$ such that 

$$k M_{xx} \operatorname{var}(b_M - b') = \sigma_1^2 \qquad (7)$$

Using a weighted mean of the two error regressions, $b_c$, we have 

$$M \operatorname{var}(b_M - b_c) = \left(1 + \frac{Mc^2}{D} + \frac{M(1 - c)^2}{\omega E}\right)\sigma_1^2$$

and to meet condition (7) $k$ must be the reciprocal of the factor on the 
right. In ignorance of $\omega$ it can be known exactly only if $c = 1$; that is, 
we use $b_D$ alone with $k = D/(M + D)$. This leads to the usual reduced 
sum of squares; or is equivalent to using 

$$b' = \left(1 - \sqrt{\frac{D}{M + D}}\right) b_M + \sqrt{\frac{D}{M + D}}\; b_D$$

so that $(b_M - b')$ is proportional to $(b_M - b_D)$ and 

$$M \operatorname{var}(b_M - b') = \frac{MD}{M + D} \operatorname{var}(b_M - b_D) = \sigma_1^2$$
If we use $c = D/(D + wE)$ for a guessed $w$, and 

$$k = D/(D + Mc)$$

then 

$$k M \operatorname{var}(b_M - b_c) = \left(1 + \frac{M(D - Dc - \omega Ec)(1 - c)}{\omega E (D + Mc)}\right)\sigma_1^2 \qquad (8)$$

This is too small if $c_\omega < c < 1$ (the minimum is too complicated to be 
worth reporting) and is too large when $c < c_\omega$, reaching a maximum 
(for $0 < c < 1$) at $c = 0$. The discrepancy is then $\sigma_1^2 M/\omega E$. If $w$ is 
a good guess for $\omega$, or even if it is bad but E is large, the discrepancy 
may not seriously affect tests made as if $\omega$ were known. However if 
only one estimated regression is to be used for all adjustments most 
workers may prefer the simpler approach of section 4. 
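
The behaviour of the factor multiplying $\sigma_1^2$ can be explored numerically. The sketch below is an illustration with hypothetical inputs: it takes $c = D/(D + wE)$ and $k = D/(D + Mc)$ for a guessed ratio $w$, and compares the directly computed factor with the closed form, assuming the true ratio $\omega$ is known to the program (as it never is in practice).

```python
def scaled_variance_factor(M, D, E, omega, w):
    """k * M * var(b_M - b_c) / sigma_1^2 when c and k are built from a guess w.

    c = D / (D + w*E) and k = D / (D + M*c); omega is the true variance ratio.
    """
    c = D / (D + w * E)
    k = D / (D + M * c)
    m_var = 1 + M * c ** 2 / D + M * (1 - c) ** 2 / (omega * E)
    return k * m_var


def closed_form(M, D, E, omega, c):
    """1 + M (D - D c - omega E c)(1 - c) / (omega E (D + M c))."""
    return 1 + M * (D - D * c - omega * E * c) * (1 - c) / (omega * E * (D + M * c))


# Hypothetical sums of squares and ratios, for illustration only.
M, D, E, omega = 31.20, 128.49, 279.33, 4.0
exact = scaled_variance_factor(M, D, E, omega, w=omega)
too_high_guess = scaled_variance_factor(M, D, E, omega, w=8.0)   # gives c below the optimum
too_low_guess = scaled_variance_factor(M, D, E, omega, w=2.0)    # gives c above the optimum
```

With an exact guess the factor is 1; guessing $w$ too large inflates it and guessing too small deflates it, in line with the text.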

The error variance of a difference between two means adjusted by 
$b_c$ is fairly obvious. Although its estimate involves $s_2^2$ that component 
will usually be so small a part of the total that the estimate can be taken 
as based on the degrees of freedom associated with $s_1^2$. Converse 
arguments apply for contrasts among split plots. 


3. ANDERSON’S METHOD 


Using covariance on dummy variables for missing plots Anderson 
(1946) noted that $b_D$, the estimate for a missing sub-plot which minimizes 
$D^*$, neglects all information on yield of the partly missing main-plot 
which is contained in its existing $(q - 1)$ sub-plots. He therefore recom- 
mended completing the main-plot yields with the sub-plot estimate, 
$b_E$; and suggested that appropriate reduced sums of squares for main- 
treatments and error, which would be unbiased in the sense that they 
are freed of $\beta$, might be 

$$M^\circ = (M + E)^* - E^*$$
$$D^\circ = (D + E)^* - E^*$$

He noted that these sums of squares are not independent and so would 
not yield a valid F test but did not investigate the amount of disturbance. 

If one wishes to use only one estimate of $\beta$, and to avoid complexities 
associated with a compound of $b_D$ and $b_E$ decides to use just one of them, 
one naturally chooses that one which is more accurately evaluated. 
Several authors have pointed out that usually var($b_E$) < var($b_D$), 
that is $\sigma_2^2/E < \sigma_1^2/D$, because $\sigma_2^2 < \sigma_1^2$ and usually E > D owing to the 
larger number of degrees of freedom associated with E although the 
mean squares may be in reverse order. We assume this to be so and 
that $b_E$ is to be used throughout. Converse arguments apply in the 
exceptional case of $b_D$ being more accurate. 

Using $b_E$ the tests of significance for sub-treatments and interaction 
obviously go through in the usual way. For main-treatments it has 
been proposed to use Anderson’s method as a general rule. There are 
three defects. 

$$M^\circ = M^* + \frac{ME}{M + E}(b_M - b_E)^2 \qquad (9)$$

$$D^\circ = D^* + \frac{DE}{D + E}(b_D - b_E)^2 \qquad (10)$$
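
As a check on the algebra, the following Python sketch computes Anderson's reduced sums of squares both from their definition, $(R + E)^* - E^*$, and from the identity $R^* + [RE/(R + E)](b_R - b_E)^2$. The helper names are hypothetical and the inputs are the rounded values printed for example 1 in section 7.

```python
def residual_ss(yy, xy, xx):
    """R* = R_yy - R_xy^2 / R_xx."""
    return yy - xy ** 2 / xx


def anderson_by_definition(row, err_e):
    """(R + E)* - E*, with row and err_e given as (yy, xy, xx) tuples."""
    pooled = tuple(a + b for a, b in zip(row, err_e))
    return residual_ss(*pooled) - residual_ss(*err_e)


def anderson_by_identity(row, err_e):
    """R* + [R E / (R + E)] (b_R - b_E)^2."""
    b_row, b_e = row[1] / row[2], err_e[1] / err_e[2]
    weight = row[2] * err_e[2] / (row[2] + err_e[2])
    return residual_ss(*row) + weight * (b_row - b_e) ** 2


# Illustrative inputs: rounded sums of squares and products printed for example 1.
M = (32.210, -29.068, 31.20)
D = (97.807, -78.005, 128.49)
E = (240.754, -191.835, 279.33)

M_anderson = anderson_by_definition(M, E)
D_anderson = anderson_by_definition(D, E)
error_ms = D_anderson / 10      # Anderson's main-plot error keeps all 10 d.f.
```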


$M^*$ and $D^*$ are independent of all three regression coefficients, and, in 
absence of treatment effects, have expectations 

$$\mathcal{E}(M^*)/(m - 2) = \mathcal{E}(D^*)/[(m - 1)(r - 1) - 1] = \sigma_1^2 = q\sigma_\delta^2 + \sigma_\epsilon^2$$

Writing R to stand for M or D 

$$\frac{RE}{R + E}\,\mathcal{E}(b_R - b_E)^2 = \frac{RE}{R + E}\left(\frac{\sigma_1^2}{R} + \frac{\sigma_2^2}{E}\right) = \frac{E\sigma_1^2 + R\sigma_2^2}{R + E} = \sigma_2^2 + \frac{Eq\sigma_\delta^2}{R + E}$$

Consequently expectations of the mean squares are 

$$\frac{\mathcal{E}(M^\circ)}{m - 1} = \sigma_1^2 - \frac{Mq\sigma_\delta^2}{(m - 1)(M + E)} \qquad (11)$$

$$\frac{\mathcal{E}(D^\circ)}{\nu_D} = \sigma_1^2 - \frac{Dq\sigma_\delta^2}{\nu_D(D + E)} \qquad (12)$$

These are not equal (except for special values of M, D and E) unless 
$\sigma_\delta^2 = 0$. In the special case studied by Anderson, with $mrq = N$, 

$$M = (m - 1)/N$$
$$D = (m - 1)(r - 1)/N$$
$$E = m(q - 1)(r - 1)/N \qquad (13)$$

Whence the discrepancy is only 

$$\frac{-(m - 1)(r - 2)\,q\sigma_\delta^2}{(m - 1)^2(r - 1) + \nu_E(\nu_E + mr - r)}$$

The absolute discrepancy must be less than $q\sigma_\delta^2/(m - 1)$ and will usually 
be much less. When x is randomly distributed it will usually be negative 
(that is $\mathcal{E}$(treatment m.sq.) < $\mathcal{E}$(error m.sq.)), but may be > 0 if 

$$\frac{M}{(m - 1)(M + E)} < \frac{D}{\nu_D(D + E)}$$

(This has happened in both examples 1 and 2, sec. 7.) 
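
To illustrate the size of this first disturbance, the sketch below evaluates the difference between the null expectations of Anderson's treatment and error mean squares, each taken as $\sigma_1^2 - Rq\sigma_\delta^2/[\nu_R(R + E)]$, with plug-in estimates. The inputs are the rounded values printed for example 1 in section 7, $q\sigma_\delta^2$ is estimated by $s_1^2 - s_2^2$, and the function name is hypothetical.

```python
def expected_ms_anderson(sigma1_sq, q_sigma_delta_sq, R, E, df):
    """Null expectation of Anderson's mean square for a main-plot row R with df d.f."""
    return sigma1_sq - R * q_sigma_delta_sq / (df * (R + E))


# Plug-in estimates from example 1 (printed, rounded values).
s1_sq, s2_sq = 5.606, 1.313
q_sigma_delta_sq = s1_sq - s2_sq          # estimate of q * sigma_delta^2
M, D, E = 31.20, 128.49, 279.33
nu_M, nu_D = 5, 10

e_treat = expected_ms_anderson(s1_sq, q_sigma_delta_sq, M, E, nu_M)
e_error = expected_ms_anderson(s1_sq, q_sigma_delta_sq, D, E, nu_D)
discrepancy = e_treat - e_error
relative = discrepancy / 5.101            # divided by Anderson's error mean square
```

Here the discrepancy is positive and a little under one per cent of the error mean square.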
Secondly, the sums of squares are distributed as the weighted sum 
of two independent $\chi^2$, namely as 

$$\chi^2_{\nu_R - 1}\sigma_1^2 + \chi^2_1\left(\sigma_1^2 - \frac{Rq\sigma_\delta^2}{R + E}\right) \qquad (14)$$

The discrepancy from homogeneous $\chi^2\sigma_1^2$ may not be serious if $\sigma_\delta^2$ is 
small or if E is large relative to M and D. 

Thirdly the $\chi^2_1$ terms in the two sums are correlated owing to both 
containing $b_E$. Their correlation coefficient is equal to the square of 
the correlation between $(b_M - b_E)$ and $(b_D - b_E)$ and is: 

$$\frac{MD}{(M + \omega E)(D + \omega E)} \qquad (15)$$

The overall correlation of the two mean squares is: 

$$MD\,[(M + E)^2(\nu_M - 1)\omega^2 + (M + \omega E)^2]^{-1/2}\,[(D + E)^2(\nu_D - 1)\omega^2 + (D + \omega E)^2]^{-1/2} \qquad (16)$$
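
Both correlations are simple to evaluate once $\omega$ is estimated. The following sketch takes the correlation of the single-degree-of-freedom terms as $MD/[(M + \omega E)(D + \omega E)]$ and the overall correlation as $MD$ divided by the square root of the product of $(R + E)^2(\nu_R - 1)\omega^2 + (R + \omega E)^2$ for R = M and R = D. The function names are hypothetical and the plug-in values are the rounded figures printed for example 1 in section 7.

```python
import math


def chi_term_correlation(M, D, E, omega):
    """Correlation of the two single-d.f. chi-square terms."""
    return M * D / ((M + omega * E) * (D + omega * E))


def mean_square_correlation(M, D, E, omega, nu_M, nu_D):
    """Overall correlation of Anderson's treatment and error mean squares."""
    a = (M + E) ** 2 * (nu_M - 1) * omega ** 2 + (M + omega * E) ** 2
    b = (D + E) ** 2 * (nu_D - 1) * omega ** 2 + (D + omega * E) ** 2
    return M * D / math.sqrt(a * b)


# Plug-in values from example 1 (printed, rounded values).
rho_chi = chi_term_correlation(31.20, 128.49, 279.33, 4.268)
rho_ms = mean_square_correlation(31.20, 128.49, 279.33, 4.268, 5, 10)
```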
All three disturbances will be small if E is large relative to M and 
D; and having further regard to the condition that they operate only 
on the square associated with a single degree of freedom in each row 
their overall disturbance of the F test is likely to be trivial, possibly 
no worse than the effect of departures from normality in the distribution 
of observations. Nevertheless most workers may feel that the foun- 
dation for the procedure is unsatisfactory and may prefer the method 
of section 4. 


4. BARTLETT’S METHOD 


Except for desire to gain additional accuracy one would usually have 
no hesitation in analyzing the results of an experiment without reference 
to a concomitant variable. Indeed one regularly does so either because 
one has not thought the effect of a variable x to have been sufficiently 
great to be worth evaluating or because observations of it have not been 
recorded. In other words the effects of x are lumped with experimental 
error with the assumption that they have been randomized with treat- 
ments as for all other components of plot error. Alternatively, knowing 
x, one may adjust on theoretical considerations, on a priori experience, 
or simply on a guess. An example of the last is to analyze the differences 
of some character after and before the application of treatments, 
equivalent to assuming b = 1 for the regression of final on preliminary 
yields. 

Suppose the true model to be as in equation (1) and that we adjust 
yields on the basis of an arbitrary regression coefficient $b_0$. (No ad- 
justment is included by putting $b_0 = 0$.) The analysis of variance 
may be done directly on adjusted yields, $z_{ijk} = y_{ijk} - b_0 x_{ijk}$; or be 
derived from covariance analysis on the original data as in table 1, 
adjusting each row by 

$$R' = R_{yy} - 2b_0 R_{xy} + b_0^2 R_{xx}.$$

Now $b_E$, being independent of main-plot rows of the analysis, can be 
introduced into them as $b_0$ in the above argument. Valid tests follow 
provided that $x_{ijk}$ have been associated at random with treatments. 
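
A minimal sketch of this adjustment in Python follows, assuming the sub-plot error regression is used as the arbitrary coefficient for the main-plot rows. The function name is hypothetical and the inputs are the rounded sums of squares and products printed for example 1 in section 7, so the results match the tabulated values only approximately.

```python
def adjust_row(yy, xy, xx, b0):
    """Sum of squares of y - b0*x for one row: R_yy - 2 b0 R_xy + b0^2 R_xx."""
    return yy - 2 * b0 * xy + b0 ** 2 * xx


# Illustrative inputs: rounded sums of squares and products printed for example 1.
M = (32.210, -29.068, 31.20)     # main treatments, 5 d.f.
D = (97.807, -78.005, 128.49)    # main-plot error, 10 d.f.
E = (240.754, -191.835, 279.33)  # sub-plot error

b_E = E[1] / E[2]                # sub-plot error regression, used as b0
treatment_ms = adjust_row(*M, b_E) / 5
error_ms = adjust_row(*D, b_E) / 10   # no d.f. is deducted from the main-plot error
F = treatment_ms / error_ms
```

Because the coefficient comes from outside the main-plot rows, the main-plot error keeps all of its degrees of freedom.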

For those who consider it improper to treat x as a random variable 
(a point of view for which we are indebted to a referee) we can regard 
$(b_0 - \beta)x_{ijk}$ as fixed arbitrary deviations which become a part of the 
deviations of $z_{ijk}$ about which no postulate need be made except that 
they are randomized with treatments. The potential distribution of 
individual mean squares will not then follow a continuous standard 
form, but randomization in many experiments (with different sets of x) 
will generate a distribution of their ratios which is approximately that 
of F (Pearson, 1937, and associated papers). 

However, in the same way that it is convenient to postulate y as a 
normally distributed random variable, whether accepting this as a basic 
postulate or as a device to simplify the argument on the grounds that 
it leads to the same approximation and has been empirically justified 
by experience, one may postulate as part of the model that x is also 




SPLIT-PLOT EXPERIMENTS 31 


a normally distributed random variable with structure 

$$x_{ijk} = \mu_x + r_j + d_{ij} + e_{ijk},$$

where $d_{ij}$ are NID$(0, \sigma_d^2)$, $e_{ijk}$ are NID$(0, \sigma_e^2)$, and 

$$\mathcal{E}(M)/(m - 1) = \mathcal{E}(D)/\nu_D = q\sigma_d^2 + \sigma_e^2 = \sigma_{1x}^2,$$
$$\mathcal{E}(S_{xx})/(q - 1) = \mathcal{E}(SM_{xx})/(q - 1)(m - 1) = \mathcal{E}(E)/\nu_E = \sigma_e^2 = \sigma_{2x}^2.$$

The expectations of adjusted y mean squares are then as usual except 
that error components $\sigma_i^2$ have to be replaced by $\sigma_i^2 + (b_0 - \beta)^2\sigma_{ix}^2$; 
and on the null hypothesis sums of squares are independently distributed 
as $\chi^2(\sigma_i^2 + (b_0 - \beta)^2\sigma_{ix}^2)$. 

If $b_0$ be replaced by $b_E$ the argument applies only to the main plot 
analysis. To test sub-plot effects we must fall back on the usual reduced 
sum of squares. On observing that the main plot analysis might be so 
treated we felt that some question might be raised about the propriety 
of treating x as a random variable, or as a set of randomly allotted 
deviations, while still applying to sub-plots a regression analysis which 
usually regards x as given constants. We therefore asked: What then 
is the appropriate test for sub-treatments when assuming x to be a 
random variable? This may be answered along well known lines by 
saying that if we demarcate critical regions with probability $\alpha$ for each 
conditional distribution given a set of x, the composite critical region 
obtained by integrating these over all sets of x will still have probability 
$\alpha$ for repetitions of the experiment.* However Sir Ronald Fisher 
(personal communication) would say that, in formulating a test of 
significance, to ask for the frequency of occurrence of an event in repeated 
samples would be the wrong question. On re-reading the paper by 
Barnard (1950) we feel obliged to agree. 


*Under the null hypothesis the conditional distributions for given x are independent of x and the 
result for control of type I error follows immediately. But under the non-null hypothesis the means of 
linear functions, whose squares form partitions of the respective sums of squares, are functions of the 
treatment constants and of means of x. Integration to obtain the marginal distributions and thence the 
average power of such tests seems impracticable. However while investigating this approach (before 
communicating with Sir Ronald Fisher and reading Barnard, 1950) we noted: (1) That in any given 
experiment the set of x may be regarded as ancillary statistics in the sense of Fisher indicating the 
accuracy of estimates obtainable from that particular experiment (as contrasted with the average 
accuracy over experiments in general). (2) That the theoretical average power over many experiments 
for some postulated distribution of x could be of interest in planning experiments but would be of little 
practical use. What it might tell about potential accuracy and thence about the size of experiment 
required to achieve a stated power would usually be trivial relative to uncertainty about error variances 
and regressions which would in fact be found. The distribution of x would usually be neither prescribable 
nor at our disposal. If the latter, we would plan to equalize all treatment means of x so that treatment 
sums of squares for x would vanish, the distribution of treatment contrasts would be again independent 
of x, and the non-central parameter in the distribution of treatment sum of squares would have its 
maximum value, proportional to $\sum\alpha_i^2$, since no adjustments would be required. 

Barnard (1950) reached similar conclusions: “Probabilities are relevant before an experiment has 
been performed, when we are planning it. After the experiment has been performed, when we are 
drawing conclusions, likelihoods are relevant. As a theory based on probabilities, the Neyman-Pearson 
theory is useful in planning, before the result is known; but after the result is known, the theory of 
likelihood should be used.” 

Fisher’s viewpoint seems to have been first expressed in his paper of 
1922, where he discusses K. Pearson’s consideration of the effect on 
regressions of sampling fluctuation of the numbers of observations in 
each array of a correlation table. Fisher wrote: “The difference in 
principle is of some importance, since the simplicity of many of the 
results here obtained is a consequence of the fact that we have not 
attempted to eliminate known quantities, given by the sample, from the 
distribution formulae of the statistics studied, but only the unknown 
quantities— parameters of the population from which the sample was 
drawn—which have to be estimated somewhat inexactly from the given 
sample . . .”. “This mixed distribution [for means of arrays of different 
sizes] need not concern us, however, for in applying tests of fitness we 
do not in practice ignore the size of the array.” 

Barnard concludes: “. . . the arbitrary nature of the reference set 
involved, on the Neyman-Pearson theory, in a test of significance, is a 
decisive reason for rejecting that theory, as a theory of inference, in 
favour of using a theory of inference, such as that given by Fisher, 
where the idea of a reference set does not enter.” 

We conclude therefore that justification for Bartlett’s procedure 
rests best on fiducial inference. Since this requires no postulate for the 
reference set of sets of x which might appear on repetitions of the 
experiment, alternative assumptions about x make no difference to the 
appropriate test for sub-treatments. 


5. APPLICATION TO MISSING SUB-PLOTS 


The argument of section 4 fails to carry over to analysis of main- 
plots when estimates (regression coefficients on dummy variables) are 
inserted for missing sub-plots. Even supposing that the missing sub- 
plot or plots occur at random, which will often not be true, it does not 
seem possible under any circumstances to regard the dummy variables 
as random. The sum of squares for any one of them in row R is identi- 
cally $\nu_R/N$ where $\nu_R$ is the degrees of freedom for the row and $N = rmq$. 
The dummy variables must therefore be treated as arbitrary constants 
throughout. 
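
The statement that a dummy variable contributes a sum of squares equal to (degrees of freedom)/N in every row can be verified by direct enumeration. The sketch below is an illustration only: it assumes a hypothetical layout with m = r = q = 4 (so N = 64), places x = -1 on one sub-plot and 0 elsewhere, and works in exact fractions. The function name and layout are not taken from the paper.

```python
from fractions import Fraction


def dummy_sums_of_squares(m, r, q):
    """Split-plot sums of squares of a dummy x that is -1 on one sub-plot, 0 elsewhere.

    Indices: i = main treatment, j = replication, k = sub-treatment.
    Returns the x sums of squares for rows M, D and E of the analysis.
    """
    x = {(i, j, k): Fraction(0) for i in range(m) for j in range(r) for k in range(q)}
    x[(0, 0, 0)] = Fraction(-1)              # the missing sub-plot
    n = m * r * q
    grand = sum(x.values()) / n

    def mean(fixed):
        """Mean of x over all cells whose indices match the fixed positions."""
        cells = [v for key, v in x.items()
                 if all(key[pos] == val for pos, val in fixed.items())]
        return sum(cells) / len(cells)

    total = sum((v - grand) ** 2 for v in x.values())
    reps = m * q * sum((mean({1: j}) - grand) ** 2 for j in range(r))
    main = r * q * sum((mean({0: i}) - grand) ** 2 for i in range(m))
    plots = q * sum((mean({0: i, 1: j}) - grand) ** 2
                    for i in range(m) for j in range(r))
    sub = m * r * sum((mean({2: k}) - grand) ** 2 for k in range(q))
    cells_ik = r * sum((mean({0: i, 2: k}) - grand) ** 2
                       for i in range(m) for k in range(q))
    err_d = plots - reps - main              # main-plot error
    inter = cells_ik - main - sub            # main x sub-treatment interaction
    err_e = total - plots - sub - inter      # sub-plot error
    return main, err_d, err_e


M_xx, D_xx, E_xx = dummy_sums_of_squares(m=4, r=4, q=4)
```

With this layout the three values are 3/64, 9/64 and 36/64, which agree with the sums of squares for x shown in example 3.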

We consider in detail only a single missing sub-plot for which we 
put x = -1. For all other plots x = 0. The estimate of the missing 
yield is $b_E$. Analysis of variance, using $b_E$ as if observed, is then 
equivalent to evaluating adjusted sums of squares as in sec. 4. We 
can now take either of two points of view: (1) $b_E$ is a random variable 
with variance $\sigma_2^2/E = \sigma_2^2 N/\nu_E$; or (2) that with respect to the main-plot 
analysis it is, as in sec. 4, a constant. 

On viewpoint (1) a main-plot sum of squares is distributed as 

$$\chi^2_{\nu_R - 1}\sigma_1^2 + \chi^2_1 R_{xx}\operatorname{var}(b_R - b_E) = \chi^2_{\nu_R - 1}\sigma_1^2 + \chi^2_1\left(\sigma_1^2 + \frac{R\sigma_2^2}{E}\right) \qquad (17)$$

On view (2) it is distributed as non-central $\chi^2$ with parameter $(b_E - \beta)^2 \nu_R/N$ 
whose average value (over many experiments) should be $\sigma_2^2\nu_R/\nu_E$. 
Although these distributions are slightly different presumably the 
quasi-F ratios should have the same distribution. On either view the 
expectations of mean squares on the null hypothesis are equal for all 
rows. 

Comparison to Anderson’s method is most easily made on view (1). 
The difference lies in the multiplier of $(b_R - b_E)^2$ which is now, by equa- 
tion (5), $R_{xx}$ in place of $R_{xx}E_{xx}/(R + E)_{xx}$ as in equations (9) and (10). 
The consequences are: (i) The null expectations of mean squares are 
equal for main treatments and error; instead of being slightly different 
as in equations (11) (12). (ii) We still have weighted $\chi^2$ distributions, 
the factor with $\chi^2_1$ being now slightly inflated by a factor proportional 
to $\sigma_2^2$, (17), whereas formerly it was deflated by a factor proportional 
to $q\sigma_\delta^2$, (14). (iii) The correlation of the $\chi^2_1$ terms remains the same as 
formerly; but their multiplier being now slightly larger, namely 
$\sigma_2^2(R_{xx} + \omega E_{xx})/E_{xx}$ in (17) instead of $\sigma_2^2(R_{xx} + \omega E_{xx})/(R_{xx} + E_{xx})$ as in 
(14), the effect will be a shade more serious. As contrasted with (16) 
the overall correlation of treatment and error mean squares now has 
$E^2$ in place of $(M + E)^2$ and $(D + E)^2$ in the denominator of (16). These 
can of course here be evaluated in terms of degrees of freedom but the 
expression is too complex to be worth bothering about. Numerical 
values are given for example 3 in section 7. 

On balance we may prefer to use for the main-plot analysis the 
adjusted mean squares (sec. 4) rather than the reduced ones (sec. 3) 
on the grounds of maintaining equalized null expectations for the mean 
squares and easier computation. 

With several, say k, missing sub-plots and estimates $b_i$, $i = 1 \ldots k$, 
we obtain on view (2) a non-central parameter proportional to 
$[\sum_i (b_i - \beta_i)]^2$ and the additional complication that expectations of 
treatment and error mean squares may not be identically equal since 
the sums of products $M_{x_i x_j}$, $D_{x_i x_j}$ do not remain proportional to the 
respective degrees of freedom unless all missing plots are in the same 
replication. The disturbance may be expected to be comparatively 


trivial. 


6. POWER 


For m treatments with effects $\alpha_i$, in r replications with variance $\sigma^2$ 
per unit plot, the criterion for entering tables of the power of analysis 
of variance tests as prepared by Tang (1938) is $\phi^2 = r\sum\alpha_i^2/m\sigma^2$. For 
a straight main plot analysis in above notation this is $rq\sum\alpha_i^2/m\sigma_1^2 = K$, 
say. After adjusting by covariance using $b_D$ the non-central parameter 
of the reduced sum of squares becomes reduced so that the criterion is 

$$K\left(1 - \frac{rq\,(\sum x_{i..}\alpha_i)^2}{(M + D)\sum\alpha_i^2}\right) \qquad (18)$$


If treatments are randomized with x so that the average correlation of 
$x_{i..}$ and $\alpha_i$ is zero, the average value of (18) is (Smith, 1955a) 


K(1 - ao (19) 


By Bartlett’s method the non-central parameter is retained proportional 
to $\sum\alpha_i^2$ but error variance is inflated leading to 

$$K[1 + (b_E - \beta)^2\sigma_{1x}^2/\sigma_1^2]^{-1} \qquad (20)$$

The average value of $(b_E - \beta)^2$ is $\sigma_2^2/E$ and the expected value of E is 
$\nu_E\sigma_{2x}^2$; therefore (20) is asymptotically 

$$K(1 + \omega_x/\omega\nu_E)^{-1} \qquad (21)$$

where $\omega = \sigma_1^2/\sigma_2^2$, $\omega_x = \sigma_{1x}^2/\sigma_{2x}^2$. If we suppose that the ratios $\omega$ and $\omega_x$ 
may be approximately equal (21) becomes approximately 

$$K(1 - 1/\nu_E) \qquad (22)$$

There is also one more degree of freedom for the estimate of error. 
Bartlett’s method may therefore be expected to have slightly more 
power on the average; but equations (18) (20) show that the comparison 
is susceptible to the $x_{i..}\alpha_i$ combinations, actual error in estimating 
$b_E$, and the distribution of x, so that any individual case may show 
appreciable shifts in either direction. 

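An alternative method of comparison is to compare the average 
variance for a difference between two treatment means. Adjusting 
by $b_D$ this is (Cochran, 1940; Finney, 1946) 

$$\frac{2\sigma_1^2}{rq}\left[1 + \frac{M_{xx}}{(m - 1)D_{xx}}\right] \to \frac{2\sigma_1^2}{rq}(1 + 1/\nu_D) \qquad (23)$$

By Bartlett’s method it is 

$$2[\sigma_1^2 + (b_E - \beta)^2\sigma_{1x}^2]/rq \to \frac{2\sigma_1^2}{rq}(1 + \omega_x/\omega\nu_E) \qquad (24)$$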




leading to a comparison very similar to the previous one. The efficiency 
relative to adjusting by $b_D$ may therefore be expressed as 

$$\frac{1 + 1/\nu_D}{1 + \omega_x/\omega\nu_E}\left(1 + \frac{2}{\nu_D(\nu_D + 3)}\right) \qquad (25)$$

where the last factor allows for the extra degree of freedom in the 
estimate of error (Fisher, 1935, sec. 74). 

Perhaps the main advantages of Bartlett’s method, rather than 
increased efficiency, are simplicity and agreement of sub- and main-plot 
means. 
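
As a hedged numerical illustration of the efficiency expression, the sketch below takes it to be $(1 + 1/\nu_D)/(1 + \omega_x/\omega\nu_E)$ multiplied by Fisher's allowance $1 + 2/[\nu_D(\nu_D + 3)]$ for the extra error degree of freedom, and evaluates it with the plug-in values printed for example 2 in section 7. The function name is hypothetical.

```python
def bartlett_relative_efficiency(omega, omega_x, nu_D, nu_E):
    """Average efficiency of Bartlett's method relative to adjusting by b_D."""
    variance_ratio = (1 + 1 / nu_D) / (1 + omega_x / (omega * nu_E))
    extra_df_factor = 1 + 2 / (nu_D * (nu_D + 3))   # allowance for one more error d.f.
    return variance_ratio * extra_df_factor


# Plug-in values printed for example 2 of section 7.
efficiency = bartlett_relative_efficiency(omega=1.453, omega_x=6.384, nu_D=9, nu_E=132)
```

For these inputs the result is close to the value 1.095 tabulated for example 2.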


7. EXAMPLES 


In the following examples we give estimates of the disturbances 
(discrepancy in expectations of treatment and error mean squares on 
the null hypothesis, and correlation of the mean squares) by substituting 
estimates of parameters as follows: 

$s_1^2$ = reduced D mean square = $D^*/(\nu_D - 1)$ 

$s_2^2$ = reduced E mean square = $E^*/(\nu_E - 1)$ 

$q\hat\sigma_\delta^2 = s_1^2 - s_2^2$, $\hat\omega = s_1^2/s_2^2$, $\hat\omega_x = \nu_E D/\nu_D E$. 

$s_A^2$ = est $(\sigma_1^2 + (b_E - \beta)^2\sigma_{1x}^2) = D'/\nu_D$ 

where $D'$ has been adjusted by $b_E$. Using this estimate of $s_A^2$ an 
empirical estimate of the relative efficiency of Bartlett’s method, 
corresponding to the ratio of the first parts of (23): (24) multiplied by 
Fisher’s factor for the information from the extra degree of freedom, is: 

$$\frac{s_1^2}{s_A^2}\left[1 + \frac{M}{(m - 1)D}\right]\left[1 + \frac{2}{\nu_D(\nu_D + 3)}\right] \qquad (26)$$

This represents the ratio of variances of treatment contrasts as they 
would actually be computed in a particular case. However both $s_1^2/s_A^2$ 
and M/D may be rather erratic: for example, in example 1, $s_1^2/s_A^2$ is 
greater than the theoretical maximum for $\sigma_1^2/\sigma_A^2 < 1$ owing to $(b_D - b_E)^2$ 
happening to be less than $s_1^2/D$; and, in both examples 1 and 2, M/D is 
less than the expected ratio 1/(r - 1). Therefore (25), with $\omega$ and $\omega_x$ 
estimated from the data, may be a more stable criterion to represent 
average efficiency for a given type of experiment. 
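
The plug-in estimates and the empirical efficiency can be sketched as follows. This is an illustration only: the function names are hypothetical, the empirical efficiency is taken to be $(s_1^2/s_A^2)[1 + M/((m - 1)D)][1 + 2/(\nu_D(\nu_D + 3))]$, and the inputs are the rounded values printed for example 1 (six main treatments, with $s_A^2$ = 5.127 taken from the comparison table).

```python
def plug_in_estimates(D_star, E_star, D_xx, E_xx, nu_D, nu_E):
    """s1^2, s2^2, q*sigma_delta^2, omega and omega_x as defined for the examples."""
    s1_sq = D_star / (nu_D - 1)
    s2_sq = E_star / (nu_E - 1)
    return {
        "s1_sq": s1_sq,
        "s2_sq": s2_sq,
        "q_sigma_delta_sq": s1_sq - s2_sq,
        "omega": s1_sq / s2_sq,
        "omega_x": nu_E * D_xx / (nu_D * E_xx),
    }


def empirical_efficiency(s1_sq, sA_sq, M_xx, D_xx, m, nu_D):
    """(s1^2 / sA^2) [1 + M / ((m - 1) D)] [1 + 2 / (nu_D (nu_D + 3))]."""
    return (s1_sq / sA_sq) * (1 + M_xx / ((m - 1) * D_xx)) * (1 + 2 / (nu_D * (nu_D + 3)))


# Example 1, using the printed (rounded) sums of squares and products.
D_star = 97.807 - 78.005 ** 2 / 128.49
E_star = 240.754 - 191.835 ** 2 / 279.33
est = plug_in_estimates(D_star, E_star, 128.49, 279.33, nu_D=10, nu_E=84)
eff_26 = empirical_efficiency(est["s1_sq"], 5.127, 31.20, 128.49, m=6, nu_D=10)
```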
The relative discrepancy of expected mean squares (on the null 

hypothesis) in Anderson’s method is measured as equation (11) minus 
equation (12) (substituting above estimate for $q\sigma_\delta^2$) divided by the error 
mean square as evaluated for the same method. 

Example 1. Bartlett (1937) illustrates on yields of a cotton experiment 
with eye estimates of salt accumulation in the soil as concomitant 
variable. The relevant parts of the analysis of covariance are: 


                            S. Sq. and Prod.              Reduced 
Source             d.f.    $y^2$      $xy$       $x^2$    M.Sq.     b        var(b) 

Main treatments      5     32.210    -29.068     31.20    1.561 
Error D             10     97.807    -78.005    128.49    5.606    -.6071    .04363 
Error E             84    240.754   -191.835    279.33    1.313    -.6868    .00470 

$\hat\omega$ = 4.268     $\hat\omega_x$ = 3.864 


The main- and sub-plot regressions are plainly similar and the sub-plot 
coefficient appears nine times more accurate than that for main plots. 
Comparison of results of the different methods is as follows: 


                                           Reduced by 
                                    d.f.   $b_D$         Bartlett      Anderson 

Main treatment mean square            5    1.561         1.405         1.368 
Main-plot error variance (d.f.)            (9) 5.606     (10) 5.127    (10) 5.101 
F                                          .278          .274          .268 
Relative efficiency (26)                                 1.16 
Relative efficiency (25)                                 1.088 
Relative discrepancy of mean squares                                   0.96% 
Correlation of $\chi^2$’s                                              .0025 
Correlation of mean squares                                            .00025 


Example 2. The following table gives the relevant parts of analysis of 
covariance for a fertilizer experiment on orange trees at Riverside, 
California. It was observed for 12 successive years and years are 
treated as the sub-treatments. The concomitant observations are 
yields of paired check plots (pounds per tree per annum). 


                            S. Sq. and Prod.              Reduced 
Source             d.f.    $y^2$      $xy$       $x^2$    M.Sq.     b        var(b) 

Main treatments      3     10,433     2,469      6,591    3,181 
Error D              9      6,916     9,729     32,902    504.9     .2957    .01535 
Error E            132     64,044    37,419     75,595    347.5     .4950    .00460 

$\hat\omega$ = 1.453     $\hat\omega_x$ = 6.384 

The difference in error regression coefficients, .199 ± .141, is non- 
significant and the accuracy of $b_E$ appears more than three times that 
of $b_D$. The tests for significance of fertilizer treatments appear as 
follows: 


                                           Reduced by 
                                    d.f.   $b_D$         Bartlett      Anderson 

Main treatment mean square            3    3181          3201          3199 
Main-plot error variance (d.f.)            (8) 504.9     (9) 594.0     (9) 550.0 
F                                          6.30          5.39          5.82 
Relative efficiency (26)                                 .949 
Relative efficiency (25)                                 1.095 
Relative discrepancy of mean squares                                   0.20% 
Correlation of $\chi^2$’s                                              .013 
Correlation of mean squares                                            .0023 


Example 3. Anderson's (1946) example of a missing sub-plot gives
relevant parts of the analysis of covariance as follows. In this method
of analysis we can insert any arbitrary value for the missing plot, say
$y_0$, and then the regression coefficient $b_E$ estimates the difference between
$y_0$ and the value which would be estimated in the usual way. To save
recomputation we have imagined $y_0$ 'guessed' as Anderson's estimated
value, 763, so that sums of squares for y can be taken from his table:
this of course leads to $b_E$ less than a half.


                                                        Reduced
Source            d.f.    $y^2$        $xy$         $x^2$     M.Sq.       b          var(b)

Main treatments     3    1031784    -118.7344      3/64
Error D             9      80644     -41.3881      9/64     8562.27    -293.89     60887
Error E            36     151950      -0.1875     36/64     4341.43      -0.33      7718


The estimates of the missing plot from sub-plot and main-plot error
rows, namely 763 and 469, appear superficially different; but the difference,
294, only slightly exceeds its standard error, $(68605)^{1/2} = 262$. The
sub-plot estimate appears nearly eight times as accurate as the main-plot
estimate. The alternative tests of significance for main-treatments
work out as follows:


                                          Reduced by
                                  d.f.    $b_D$         Bartlett      Anderson

Main treatment mean square          3     302396        343902        336192
Main-plot error variance (d.f.)           (8) 8562      (9) 8957      (9) 8687
F                                         35.32         38.39         38.70
Relative discrepancy of mean squares                    0             -0.17%
Correlation of $\chi^2$'s                               .0046         .0046
Correlation of mean squares                             .00083        .00064

SUMMARY 


The paper examines some methods which have been proposed for 
covariance adjustments in split-plot experiments when it may be 
assumed that the regressions appropriate to sub-plot and main-plot 
comparisons are equal. Theoretically the most efficient analysis 
should be given by estimating the regression coefficient by a weighted 
mean of those indicated by the main-plot error and sub-plot error rows 
of the analysis of covariance. But this would usually be too troublesome. 

Frequently the sub-plot error estimate is considerably the more 
accurate of the two and one may wish to use it for adjustment of main-, 
as well as of sub-treatments. When this is done two methods of testing 
the significance of main-treatments have been proposed. One was 
tentatively proposed by Anderson (1946) for analyses with missing 
sub-plots, but has been used by other workers for adjustment on an 
observed concomitant variable. There are theoretical objections: the
expectations of treatment and error mean squares are not identically
equal when the null hypothesis is true, the reduced sums of squares
are distributed as the weighted sum of two $\chi^2$'s instead of as simple
$\chi^2$'s, and they are correlated. The disturbances to the theoretical
conditions for an exact F test are however so small as to be trivial in
practice.

Bartlett (1937) proposed to adjust all main-plot sums of squares
using the sub-plot regression as an arbitrary correction independently
evaluated. This is shown to lead to valid tests of significance provided
the treatments have been randomized with respect to the concomitant 
variable. It is not theoretically valid for missing sub-plots. It does 
maintain equal expectations for the null mean squares; but, in this 
application, still has the other two defects of the Anderson method— 
again trivial. 

The gain in efficiency from using the sub-plot regression instead of 
the main-plot regression is about 9 per cent in two examples examined. 
Perhaps the chief advantage lies in maintaining consistency between 
adjusted sub-plot and main-plot means, thus simplifying presentation 
of results. When this is to be done the Bartlett procedure is recom- 
mended, being computationally simpler and having better theoretical 
validity. 

Although both split-plot experiments and experiments with co- 
variance are very common, we found it surprisingly difficult to find 
examples of split-plots with covariance. Of the nine found, six showed
significant differences between the main-plot and sub-plot regression
coefficients, suggesting that to assume equality may not be as generally
valid as one would expect.


REFERENCES 


Anderson, R. L. (1946). Missing-plot techniques. Biometrics Bull., 2, 41-47. 

Barnard, G. A. (1950). On the Fisher-Behrens test. Biometrika, 37, 203-207. 

Bartlett, M.S. (1937). Some examples of statistical methods of research in agricul- 
ture and applied biology. J. Roy. Stat. Soc. Suppl., 4, 137-170. 

Cochran, W. G. (1940). Analysis of lattice and triple lattice experiments. II. 
Mathematical theory. Iowa Agr. Expt. Sta. Res. Bull., 281, 64-65. 

Cochran, W. G. (1946). Analysis of covariance. Univ. North Carolina, Inst. Stat. 
Mimeo. Series No. 6. 

Finney, D. J. (1946). Standard errors of yields adjusted for regression on an
independent measurement. Biometrics Bull., 2, 53-55.

Fisher, R. A. (1922). The goodness of fit of regression formulae and the distribution 
of regression coefficients. J. Roy. Stat. Soc., 85, 597-612. 

Fisher, R. A. (1935). The design of experiments. Oliver and Boyd: Edinburgh. 

Pearson, E. S. (1937). Some aspects of the problem of randomization. Biometrika,
29, 53-64.

Smith, H. F. (1955a). Tests of significance in analysis of covariance and some related 
regression techniques. Zan 


Smith, H. F. (1955b). Variance components, finite populations and experimental 
inference. Univ. North Carolina, Inst. Stat. Mimeo Series. No. 135. MS 

Tang, P. C. (1938). The power function of the analysis of variance tests. Stat. Res.
Mem., 2, 126-149.


A NOTE ON THE COMBINATION OF ESTIMATES OF 
RELATIVE POTENCY IN MULTIPLE ASSAYS* 


PAMELA M. CLARKE


National Institute for Research in Dairying, Shinfield, England 


A problem of frequent occurrence in the field of biological assay 
is that of combining several estimates of the potency of a substance. 
Such estimates may be obtained, for example, on different occasions, 
or in different laboratories, or even by different methods, and generally 
they will have different variances, so that some form of weighted mean 
is usually required. Bliss (1952) and Finney (1952) have discussed 
methods of calculating suitable weighted means and their fiducial limits 
in various circumstances, and Bennett (1954) has given further ex- 
tensions of the theory. 

All these methods, however, are appropriate only when the individual 
estimates of potency are independent, a condition which is not fulfilled 
in certain cases of practical interest. For example, in an experiment 
planned to examine the sampling variability of the vitamin content of 
milk, several samples of milk were taken and assayed against the same 
standard preparation in a multiple assay. A combined estimate of the 
vitamin content of the milk was then required, with estimates of fiducial 
limits based on the variation from sample to sample, which happened to 
be appreciable. A similar situation might arise if samples of grass 
from different parts of a field were tested, again in a multiple assay, 
for oestrogen content, and if an overall estimate of the oestrogen 
content were also required. In such cases, multiple assay designs are 
economical in time and material, but since the sample estimates of 
relative potency are obtained by reference to the same set of results 
for the standard, they are not independent, so that new methods for 
obtaining the limits of error are required. 

The method will first be illustrated by a practical example, and 
this will be followed by an outline of the derivation of the formula. 


Numerical example 


*N.I.R.D. paper No. 1728.

The data for the example are taken from the results of a variability
study, made in collaboration with Dr. M. E. Gregory, on the microbiological
assay of riboflavin in samples from the same bulk of cows'
milk, using Lactobacillus casei as test organism. Estimates of the
potency of the individual samples were required in order to examine
the variation between them, and this variation was to be taken into 
account in assessing the limits of error for the combined estimate of 
potency obtained from different samples. The complete experiment 
occupied 5 days, but one day’s results are sufficient here. 

Five samples of milk were assayed against a standard riboflavin
solution at 4 dose-levels: 1, 2, 3 and 4 ml. There were two replications,
arranged in randomized blocks in the two halves of a wire basket, and 
each operation was carried out first on one block and then on the other, 
in the same order. Twice as many tubes were set up for the standard 
as for each milk sample. 

The individual observations are set out in Table 1, and Table 2
shows the results of the analysis of variance. As expected, the linear
regression on log dose was highly significant, and the usual validity
tests for a parallel-line assay were satisfied, since the mean squares
for deviations from a linear regression and for the interactions (doses
x standard v. milk) and (doses x milk samples) were not significant
(P > 0.05). In fact the latter mean square approached significance at
the 5% level, but the combined evidence of the 5 days' results supported
the assumption of the validity of the model.


TABLE 1
ASSAY OF VITAMIN B2 POTENCY OF 5 SAMPLES OF COWS' MILK
Individual results: 10^3 X log (optical density reading)

              Standard                        Milk samples
Dose     Duplicate tubes   Total      1      2      3      4      5    Total
(ml.)

Block 1
  4        857     857      1714     875    806    886    820    869    4256

Total over both blocks


TABLE 2
ASSAY OF VITAMIN B2 POTENCY OF 5 SAMPLES OF COWS' MILK
Mean squares in the analysis of variance

Source of variation               Degrees of freedom    Mean square

Blocks                                     1                    1
Preparations
  Standard v. milk                         1                1,064
  Between milk samples                     4               14,069 = $s_2^2$
Doses
  Linear regression                        1              921,069
  Deviations                               2                  166
Doses X preparations
  Doses X (standard v. milk)               3                  120
  Doses X milk samples                    12                  263
Blocks X (standard v. milk)                1                   50
Blocks X linear regression                 1                  927
Residual                                  29                  128 = $s_1^2$

The highly significant (P < 0.001) mean square for differences 
between milk samples reflects the wide variation between the estimates 
of potency for the different samples, shown in Table 3. Further experi- 
mentation was carried out to investigate the reasons for this variation, 
but meanwhile a combined estimate of the relative potency of the milk 
and limits of error were required. Clearly the between-sample variation 
cannot be ignored in such a case, and it would be incorrect to follow 
the procedures outlined by Bliss (1952) or Finney (1952) since the five 
estimates of relative potency are not independent. Finney's methods
are in any case intended for use only if the sample estimates of relative
potency are homogeneous.


TABLE 3
ASSAY OF VITAMIN B2 POTENCY OF 5 SAMPLES OF COWS' MILK
Sample estimates of relative potency

Sample    Relative     5% fiducial
          potency      limits

  1        1.07        1.03, 1.12
  2        0.78        0.75, 0.82
  3        Weel        Lee) 22
  4        0.83        0.80, 0.87
  5        1.01        0.97, 1.06

In the general case, let $v$ be the number of samples of the test
preparation, $n$ be the number of replications for each test sample and
$n(p + 1)$ be the number of replications for the standard preparation.
Let $k$ denote the number of dose-levels for each material, with log
doses denoted by $x_1, x_2, \ldots, x_k$. Finally, let $Y_i$ be the total response,
over both preparations and all samples, at log dose $x_i$, let $Y_T$ be the
total response to all doses of all test samples and let $Y_S$ be the total
response to all doses of the standard.

In this example, therefore, we have $v = 5$, $n = 2$, $p = 1$, $k = 4$;
$Y_1 = 7240$, $Y_2 = 9589$, $Y_3 = 11086$, $Y_4 = 11984$, $Y_T = 28389$ and
$Y_S = 11510$.

We first calculate $[X]$ as $k\sum x_i^2 - (\sum x_i)^2$, giving $[X] = 0.818$.

Then the mean slope $b$ of the log dose/response line is given by

$$b = \Bigl\{k \sum x_i Y_i - \sum x_i \sum Y_i\Bigr\} \Big/ n(p + v + 1)[X] = 567.$$

Since $M$, the log relative potency, is given by

$$M = \{(p + 1) Y_T - vY_S\}/nvkb(p + 1),$$

we have $M = -0.017$.
The next step in the calculations is to evaluate $g$, which is given by

$$g = kt^2s_1^2/n(p + v + 1)[X]b^2,$$

where $s_1^2$ is the mean square shown in Table 2. The value of $t$ is obtained
from standard tables at the required probability level, but a slight
difficulty here is in deciding the appropriate number of degrees of
freedom for entering the tables. This number lies between $f_2$ and
$(f_1 + f_2)$, where $f_1$ and $f_2$ are the number of degrees of freedom for
$s_1^2$ and $s_2^2$ respectively (see Table 2). Basing $t$ for the moment on $f_2$,
i.e. 4, degrees of freedom, $g$ is found to be only 0.001, and since we
have taken the largest possible value of $t$, $g$ may safely be taken as zero
in the remaining calculations.
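
The arithmetic of these first steps can be retraced as follows. This is a minimal sketch under stated assumptions: the log doses are taken as common logarithms of 1, 2, 3 and 4 ml., the two-sided 5% value of $t$ for 4 degrees of freedom is taken as 2.776, and all variable names are illustrative.

```python
import math

# Design constants and totals quoted in the text
v, n, p, k = 5, 2, 1, 4
doses_ml = [1, 2, 3, 4]
Y = [7240, 9589, 11086, 11984]     # totals at each dose, all preparations
Y_T, Y_S = 28389, 11510            # totals for test samples and standard
s1_sq = 128                        # residual mean square from Table 2
t = 2.776                          # assumed: 5% two-sided t on 4 d.f.

x = [math.log10(d) for d in doses_ml]   # assumed: logs to base 10

# [X] = k * sum(x^2) - (sum x)^2
X = k * sum(xi ** 2 for xi in x) - sum(x) ** 2

# Mean slope of the log dose/response line
b = (k * sum(xi * yi for xi, yi in zip(x, Y)) - sum(x) * sum(Y)) / (n * (p + v + 1) * X)

# Log relative potency
M = ((p + 1) * Y_T - v * Y_S) / (n * v * k * b * (p + 1))

# g, the quantity that is shown to be negligible in this example
g = k * t ** 2 * s1_sq / (n * (p + v + 1) * X * b ** 2)

print(f"[X] = {X:.3f}, b = {b:.0f}, M = {M:.3f}, g = {g:.3f}")
```

With these inputs the sketch returns the values quoted in the text for $[X]$, $b$, $M$ and $g$.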

It is now possible to calculate the fiducial limits of the relative
potency estimate. Since $g$ is small, $s_M^2$, the variance of $M$, is approximately
equal to $(\rho + 1)s_2^2/nvkb^2$, where

$$\rho = \frac{s_1^2}{s_2^2}\left\{\frac{vk^2M^2}{(p + v + 1)[X]} + \frac{v}{p + 1}\right\}.$$

The numerical example gives $\rho = 0.0228$, so that the variance of $M$
is approximately equal to 0.001119.

An estimate of the effective number of degrees of freedom, $f$, to
ascribe to $s_M^2$, may be found quite simply from an approximate formula
given by Cochran (1951):

$$f = \frac{(\rho + 1)^2}{\rho^2/f_1 + 1/f_2}.$$

Since in our example we have $f_1 = 29$ and $f_2 = 4$, this formula gives
$f = 4.2$, and interpolation in standard tables gives a value of 2.7 for
$t$. It may be noted that the value of $f$ is close to $f_2$ when the ratio of
$s_2^2$ to $s_1^2$ is high, as in this case; for a value of $s_2^2$ nearer $s_1^2$, i.e. when the
sampling variation is less important, $f$ is nearer $f_1 + f_2$.

From the formula $M_L, M_U = M \pm ts_M$, the logarithms of the 5%
fiducial limits of the relative potency estimate are thus found to be
approximately -0.107 and 0.073, and the combined estimate of relative
potency over all samples is therefore 0.96, with approximate 5% fiducial
limits of 0.78 and 1.18. It is worth noting that had the between-sample
variation not been taken into account, the fiducial limits would have
been calculated to be 0.94 and 0.99, plainly under-estimates.
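
The remaining steps may be sketched in the same way. The forms used below for $\rho$ and for the effective degrees of freedom are assumptions, chosen because they follow from the variances of the slope and of the preparation means when $g = 0$ and because they reproduce the figures quoted above ($\rho = 0.0228$, variance 0.001119, $f = 4.2$); the rounded inputs are those printed in the text, and the names are illustrative.

```python
import math

# Rounded quantities quoted in the text
v, n, p, k = 5, 2, 1, 4
X, b, M = 0.818, 567.0, -0.017
s1_sq, f1 = 128.0, 29      # residual mean square and its d.f.
s2_sq, f2 = 14069.0, 4     # between-samples mean square and its d.f.
t = 2.7                    # value read from tables for f of about 4.2

# Assumed form of rho when g is taken as zero
rho = (s1_sq / s2_sq) * (v * k ** 2 * M ** 2 / ((p + v + 1) * X) + v / (p + 1))

# Approximate variance of M
var_M = (rho + 1) * s2_sq / (n * v * k * b ** 2)

# Effective degrees of freedom (assumed form of the approximate formula)
f = (rho + 1) ** 2 / (rho ** 2 / f1 + 1 / f2)

# Limits on the log scale, then converted back to relative potency
half_width = t * math.sqrt(var_M)
log_lower, log_upper = M - half_width, M + half_width
potency = 10 ** M
lower, upper = 10 ** log_lower, 10 ** log_upper

print(f"rho = {rho:.4f}, var(M) = {var_M:.6f}, f = {f:.1f}")
print(f"potency = {potency:.2f}, limits = {lower:.2f}, {upper:.2f}")
```

Because the between-sample mean square dominates, $\rho$ is small and $f$ stays close to $f_2$, which is why the limits are so much wider than those obtained when the sample variation is ignored.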

In the full analysis of the experimental results, this procedure was
followed to obtain estimates of $M$ and $s_M$ for each of the 5 days, and
these values were then used to test the homogeneity of the separate
day estimates of relative potency. The variation from day to day 
proved to be greater than would be expected simply from a consideration 
of the variation between samples within days. Since, however, the 
relative potency estimates for different days were independent, the 
method given by Bliss (1949) could be used to obtain a weighted mean 
log potency value, and a standard error allowing for variation between 
days. 

The particular example given here, while demonstrating clearly 
the need to consider the variation between samples in work of this kind, 
may perhaps obscure the practical utility of the method presented in 
this paper for the combination of estimates of relative potency, because 
of the high ratio of between-sample to within-sample variance. As with 
the method for independent estimates of potency, given by Bliss, in such 
a case the estimates of the mean log potency and its variance are almost 
the same as would be obtained by direct calculation from the separate 
sample estimates of relative potency under the assumption of independence.
It is in the less extreme, and frequently occurring, cases that the
method is most valuable, and it can, of course, be used over the whole
range of values of the ratio of the two variance components, without
any further assumptions as to the safety of using more approximate
methods.


Derivation of formulae


The formulae to be obtained in this section are more general than
those quoted in the example, since it will not be assumed that $g$ is
negligible. The small modification of the method required when the
replicates are not arranged in blocks will also be described.

Under the assumptions for a parallel-line assay, the response at
log dose $x_i$ of the standard in block $j$ may be expressed as

$$y_{Sij} = \alpha_j + \beta_j x_i + \epsilon \qquad (i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n),$$

where $\alpha$ and $\beta$ are constant for all observations in a block, and $\epsilon$ is
randomly and normally distributed with mean zero and variance $\sigma^2$.

Similarly, for the response at log dose $x_i$ of the $r$th sample of the
test preparation in block $j$ we can take

$$y_{Trij} = \alpha_j + \beta_j \mu + \beta_j x_i + \epsilon' + \epsilon \qquad (r = 1, 2, \ldots, v),$$

where $\mu$ is the true log relative potency of the test preparation and
$\epsilon'$, which is constant over any one sample, is normally distributed with
mean zero and variance $\sigma'^2$.

The values of $\beta_j$ are assumed to be normally distributed about a
mean $\beta$.

The expressions for $b$ and $M$ have already been given in the example.
To find the fiducial limits of $M$, Fieller's theorem (1940) may be applied,
leading to the following equation:

$$M_L, M_U = \frac{1}{1 - g}\left[M \pm \frac{t}{b}\left\{\frac{\{vk^2(p + 1)M^2 + v(1 - g)(p + v + 1)[X]\}\sigma^2}{nvk(p + 1)(p + v + 1)[X]} + \frac{(1 - g)(p + 1)(p + v + 1)[X](\sigma^2 + nk\sigma'^2)}{nvk(p + 1)(p + v + 1)[X]}\right\}^{1/2}\right],$$

where $g$ is as defined in the numerical example, but with $s_1^2$ replaced
by $\sigma^2$. In the same way as for the individual sample estimates of
relative potency, the component of variance for the $\beta_j$ does not occur
in the expression for the fiducial limits.

In practice, $\sigma^2$ and $\sigma'^2$ are usually estimated from the observations
themselves. If the design involves randomized blocks, the expected
mean square for the differences between samples is $(\sigma^2 + nk\sigma'^2)$, with
$(v - 1)$ degrees of freedom. An estimate of $\sigma^2$, with

$$\{(n - 1)(vk + k - 3) + npk\}$$

degrees of freedom, is obtained by combining the mean squares for
differences between duplicates within blocks and all interactions with
blocks except the interactions (blocks x standard v. test preparations)
and (blocks x linear regression). If replications are not arranged in
blocks, an estimate of $(\sigma^2 + nk\sigma'^2)$ is obtained in the same way as for
designs with blocks, and an estimate of $\sigma^2$, with

$$k\{(n - 1)(v + 1) + np\}$$

degrees of freedom, is given by the mean square for differences within
each combination of preparation and dose-level.

Let $s_1^2$ and $s_2^2$ denote the mean squares estimating $\sigma^2$ and $(\sigma^2 + nk\sigma'^2)$
respectively, and let $f_1$ and $f_2$ denote the corresponding degrees of
freedom. Then the equation for the fiducial limits may be written


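$$M_L, M_U = \frac{1}{1 - g}\left[M \pm \frac{ts_2}{b}\left\{\frac{1 - g + \rho}{nvk}\right\}^{1/2}\right],$$

where

$$\rho = \frac{s_1^2}{s_2^2}\left\{\frac{vk^2M^2}{(p + v + 1)[X]} + \frac{v(1 - g)}{p + 1}\right\}.$$
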
The effective number of degrees of freedom, f, may be determined 
by Welch’s method (1947) as 


eyes (parle 
i a 2 1 a) 
ene 
{io Zee tare = 
but the simpler formula quoted in the example is usually adequate.
Both formulae are approximate, and since Cochran's formula gives a
lower value, it leads to a conservative estimate of $t$.

If g is not small enough to be negligible (i.e. is not less than about 
0.1), the estimation of g and f may be achieved by an iterative process, 
but this will rarely be necessary, since multiple assays usually occur 
as microbiological assays, in which case g is generally very small. 
When g is taken to be zero, the formulae simplify to those given in the 
example. The use of doses spaced at equal intervals on a logarithmic 
scale also simplifies the calculations. 

I am grateful to Mr. C. P. Cox for helpful suggestions. 


MISSING AND "MIXED-UP" FREQUENCIES IN
CONTINGENCY TABLES 


G. S. Watson 


Australian National University, Canberra, A. C. T. 


1. Introduction 


Missing and mixed-up values in experiments designed for analysis 
of variance cause but little trouble because the method of dealing with 
this type of data is well-known (see e.g. Cochran and Cox, 1950) and 
easily applied. The same problem may appear, though less often, in 
frequency data to be analysed by chi-square. For example, suppose 
that a botanist selects a random sample of a certain type of eucalypt 
in each of three rainfall belts. Each selected tree is classified as high, 
medium or low. The botanist intends to make a chi-square test, in 
the 3 X 3 table so obtained, that there is no association between height 
and rainfall. But when he comes to do the analysis he finds that, in 
the high rainfall belt, only the frequency of high trees is clearly desig- 
nated—the other two frequencies cannot be identified. How should 
he make his test? Alternatively, suppose that in his data the number 
of high trees in the high rainfall sample, and the total number of trees 
in that sample are missing. How should he make his test? 

The tests required are given below. The procedures for more
complicated cases are also suggested.


2. Missing frequencies 
Suppose first that the cell frequencies $f_{ij}$ in an $r \times c$ contingency
table are incomplete because $f_{11}$ is missing. Denote the existing row
and column totals by $R_i$ $(i = 1, \ldots, r)$ and $C_j$ $(j = 1, \ldots, c)$ respectively
and the total of the recorded frequencies by $N$. On the null
hypothesis of no association, the observed cell frequencies are a sample
of $N$ from a multinomial population with probabilities

$$\frac{p_i q_j}{1 - p_1 q_1} \qquad (i = 1, \ldots, r;\ j = 1, \ldots, c;\ (i, j) \neq (1, 1)),$$

where $\sum p_i = 1$, $\sum q_j = 1$. The maximum likelihood estimates of the
$p_i$ and the $q_j$ are easily seen to satisfy the equations


$$\frac{R_1}{\hat p_1} + \frac{N\hat q_1}{1 - \hat p_1 \hat q_1} + \lambda = 0, \qquad \frac{R_i}{\hat p_i} + \lambda = 0 \quad (i = 2, \ldots, r)$$

and

$$\frac{C_1}{\hat q_1} + \frac{N\hat p_1}{1 - \hat p_1 \hat q_1} + \mu = 0, \qquad \frac{C_j}{\hat q_j} + \mu = 0 \quad (j = 2, \ldots, c).$$

Writing $x$ for $N\hat p_1 \hat q_1/(1 - \hat p_1 \hat q_1)$, it is clear that $\lambda = -(N + x) = \mu$
so that

$$\hat p_1 = \frac{R_1 + x}{N + x}, \qquad \hat p_i = \frac{R_i}{N + x} \quad (i = 2, \ldots, r);$$
$$\hat q_1 = \frac{C_1 + x}{N + x}, \qquad \hat q_j = \frac{C_j}{N + x} \quad (j = 2, \ldots, c).$$

These equations are of the same form as the equations when no data
are missing, if $x$ is interpreted as the missing value. Introducing the
expressions for $\hat p_1$ and $\hat q_1$ into the formula for $x$ and solving the resulting
quadratic, we find

$$x = \frac{R_1 C_1}{N - R_1 - C_1} \qquad (1)$$

since the other solution $x = -N$ may be ignored. If then $x$ is calculated
from this formula and added to the first row, first column and grand
totals, the chi-square computed by the ordinary method receives no
contribution from the cell (1, 1) since

$$x = \frac{(R_1 + x)(C_1 + x)}{N + x}.$$

Indeed this is an intuitive way of computing $x$.

The resulting chi-square may be written algebraically as

$$\chi^2 = \sum_{(i,j) \neq (1,1)} \frac{\{f_{ij} - (N + x)\hat p_i \hat q_j\}^2}{(N + x)\hat p_i \hat q_j}$$

with $(r - 1)(c - 1) - 1$ degrees of freedom. The general theory of
chi-square tests of fit for the multinomial distribution (see e.g. Cramer
1946) gives us the chi-square

$$\chi^2 = \sum_{(i,j) \neq (1,1)} \frac{\{f_{ij} - N\hat p_i \hat q_j/(1 - \hat p_1 \hat q_1)\}^2}{N\hat p_i \hat q_j/(1 - \hat p_1 \hat q_1)}$$

with $(r - 1)(c - 1) - 1$ degrees of freedom. Since

$$(N + x)(1 - \hat p_1 \hat q_1) = N$$

these two expressions are identical so that the intuitive method is
correct. This is not completely analogous to the situation in the
analysis of variance; there the treatment sum of squares, calculated
from the data including the missing value estimate, is biased although
the error sum of squares is not.

As with the analysis of variance, the correct analysis, when there
are several missing frequencies, varies with their disposition. The
above analysis is easily extended and shows that the correct procedure
can be based on the missing value formula (1) and the ordinary method
of computing chi-square. The formula (1) should be used iteratively
to give estimates of all the missing values. The degrees of freedom of
chi-square will be $(r - 1)(c - 1)$ less the number of missing frequencies.
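
The single-missing-cell procedure can be illustrated by a short sketch. The 3 x 3 frequencies below are invented purely for illustration and the function names are hypothetical; the code fills cell (1, 1) by formula (1), computes chi-square in the ordinary way from the augmented table, and reduces the degrees of freedom by one.

```python
def missing_value(table):
    """Estimate the missing (1, 1) frequency; table[0][0] must be None."""
    r1 = sum(table[0][1:])                          # existing first-row total R_1
    c1 = sum(row[0] for row in table[1:])           # existing first-column total C_1
    n = r1 + sum(sum(row) for row in table[1:])     # total of recorded frequencies N
    return r1 * c1 / (n - r1 - c1)                  # formula (1)


def chi_square_with_missing_cell(table):
    """Ordinary chi-square on the table completed with the estimate x."""
    x = missing_value(table)
    filled = [list(row) for row in table]
    filled[0][0] = x
    rows = [sum(row) for row in filled]
    cols = [sum(col) for col in zip(*filled)]
    total = sum(rows)
    chi2 = 0.0
    for i, row in enumerate(filled):
        for j, f in enumerate(row):
            expected = rows[i] * cols[j] / total
            chi2 += (f - expected) ** 2 / expected  # cell (1, 1) adds essentially zero
    df = (len(rows) - 1) * (len(cols) - 1) - 1      # one d.f. lost for the missing cell
    return x, chi2, df


# Hypothetical data: rows are rainfall belts, columns are height classes
table = [[None, 12, 8],
         [15, 20, 10],
         [5, 18, 22]]
x, chi2, df = chi_square_with_missing_cell(table)
print(f"x = {x:.3f}, chi-square = {chi2:.3f} on {df} d.f.")
```

For several missing cells the same function for $x$ would be applied to each cell in turn, with the other estimates held at their current values, until the estimates settle.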


3. "Mixed-up" frequencies.


Suppose, for example, that in our $r \times c$ contingency table, the
identity of $f_{11}$ and $f_{12}$ is lost completely. Then the observed frequencies
are, on the null hypothesis, a sample of $N$ from a multinomial with
probabilities

$$p_1 q_1 + p_1 q_2, \qquad p_i q_j \quad (i = 1, \ldots, r;\ j = 1, \ldots, c;\ (i, j) \neq (1, 1), (1, 2)).$$

If the loss is partial, we might be able to associate probabilities
$\pi p_1 q_1 + (1 - \pi)p_1 q_2$ and $(1 - \pi)p_1 q_1 + \pi p_1 q_2$ with cells (1, 1) and
(1, 2) where $\pi$ is the strength of our belief that $f_{11}$ belongs to cell (1, 1).
Since in most practical cases our estimate of $\pi$ would be vague, we take
it to be $\tfrac{1}{2}$; this gives the multinomial above.

Applying maximum likelihood as before, we find that

$$\hat p_i = \frac{R_i}{N} \quad (i = 1, 2, \ldots, r),$$

$$\hat q_1 = \frac{(C_1 - f_{11})(C_1 + C_2)}{(C_1 + C_2 - f_{11} - f_{12})N},$$

$$\hat q_2 = \frac{(C_2 - f_{12})(C_1 + C_2)}{(C_1 + C_2 - f_{11} - f_{12})N},$$

$$\hat q_j = \frac{C_j}{N} \quad (j = 3, \ldots, c).$$

The estimates are those which would be found intuitively. Chi-square
must be calculated as for a multinomial with $rc - 1$ cells with probabilities
as given above and it will have $(r - 1)(c - 1) - 1$ degrees of
freedom.
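
The estimates for the mixed-up case can be computed from the pooled count alone. In the sketch below the data are invented and the function name is hypothetical; only the sum $f_{11} + f_{12}$ is supplied, together with the intact cells, so that $C_1 - f_{11}$ and $C_2 - f_{12}$ appear as the column counts over rows 2 to $r$.

```python
def mixed_up_estimates(pooled, first_row_rest, other_rows):
    """Maximum likelihood estimates when f11 and f12 are known only in total.

    pooled         -- f11 + f12
    first_row_rest -- [f13, ..., f1c]
    other_rows     -- complete rows 2..r
    """
    n = pooled + sum(first_row_rest) + sum(sum(row) for row in other_rows)

    # Row probabilities: ordinary row totals over N
    row_totals = [pooled + sum(first_row_rest)] + [sum(row) for row in other_rows]
    p_hat = [total / n for total in row_totals]

    # Column counts from the intact rows only
    lower_cols = [sum(col) for col in zip(*other_rows)]
    c1_rest, c2_rest = lower_cols[0], lower_cols[1]   # C1 - f11 and C2 - f12

    # q1 + q2 is estimated by (C1 + C2)/N and split in the ratio of the intact counts
    q12 = (c1_rest + c2_rest + pooled) / n
    q_hat = [q12 * c1_rest / (c1_rest + c2_rest),
             q12 * c2_rest / (c1_rest + c2_rest)]
    q_hat += [(first_row_rest[j] + lower_cols[j + 2]) / n
              for j in range(len(first_row_rest))]
    return p_hat, q_hat


# Hypothetical 3 x 3 table in which the first two cells of row 1 are pooled
p_hat, q_hat = mixed_up_estimates(20, [8], [[15, 20, 10], [5, 18, 22]])
print([round(p, 4) for p in p_hat], [round(q, 4) for q in q_hat])
```

Both sets of estimates sum to unity, as they must, and the expected frequencies for the $rc - 1$ cells follow by multiplying $N$ into the probabilities given above.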

The same method may be used when another two, or more than two,
frequencies are "mixed-up".


CONTRIBUTIONS TO SIMULTANEOUS CONFIDENCE
INTERVAL ESTIMATION*


K. V. RAMACHANDRAN** 


Institute of Statistics 
University of North Carolina 


1. Summary. In this paper simultaneous confidence bounds for
parameters in two different situations are given and examples are
worked out to illustrate their uses. The present paper is primarily
concerned with the calculation of simultaneous confidence intervals
for specified confidence coefficient $1 - \alpha$. We are controlling the
risk of error "experiment wise" in the sense of Tukey. Practical
application of rules of estimation involving variable $\alpha$ would require
additional tables.


2. Statement of the Problems. 


(a) Let $x_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n + 1)$ be samples of sizes
$(n + 1)$ from $k$ independent normal populations with means $\mu_i$ and
variances $\sigma_i^2$ $(i = 1, 2, \ldots, k)$. When $\sigma_i^2 = \sigma^2$ $(i = 1, 2, \ldots, k)$,
simultaneous confidence bounds connected with the means $\mu_i$ are given
[8, 9, 10]. In this paper we consider the problem of simultaneous
confidence interval estimation on all ratios of the variances $\gamma_{ii'} =
\sigma_i^2/\sigma_{i'}^2$ $(i \neq i';\ i, i' = 1, 2, \ldots, k)$. The problem of simultaneous
confidence interval estimation for all contrasts of the log variances will not
be discussed here.

(b) In factorial experiments we are usually interested in estimating
linear functions of treatment effects, whose estimates are independently
distributed with a common variance. Suppose, for example, that we
have observations from a $t^p$ factorial experiment with factors $A_1$,
$A_2, \ldots, A_p$ at $t$ levels each and suppose that we are interested in
simultaneously estimating the main effects only. We shall suppose
that the experiment is so laid out that none of these is confounded in
any replication. Let $\theta_i$ $(i = 1, 2, \ldots, p)$ denote the true main effects
and let $s^2$ be an unbiased and independent estimate of the common
(unknown) population variance $\sigma^2$ based on $q$ degrees of freedom
(d.f.) (say, the error mean square in the analysis of variance). Let
$v_i$ $(i = 1, 2, \ldots, p)$ be the mean squares corresponding to the main
effects of $A_i$ $(i = 1, 2, \ldots, p)$. It is known that when the true main
effects are zero, the $v_i$ are distributed as independent central chi-square
variables with $t - 1$ d.f. each. In factorial experiments it is well known
that the $t - 1$ d.f. belonging to the main effects of $A_i$ $(i = 1, 2, \ldots, p)$
can be split up into $t - 1$ orthogonal components of 1 d.f. each. The
most useful splitting up of the sum of squares due to the main effect
of $A_i$ $(i = 1, 2, \ldots, p)$ is into 1 d.f. for linear, quadratic, cubic, ...,
effects. Also $\theta_i$ $(i = 1, 2, \ldots, p)$ can be split up in a similar way into
$t - 1$ components corresponding to the $t - 1$ groups of the sums of
squares of the main effects of $A_i$ $(i = 1, 2, \ldots, p)$. Let $\mu_{i1}, \mu_{i2}, \ldots,
\mu_{i(t-1)}$ be the respective true values such that $\sum_{j=1}^{t-1} \mu_{ij}^2 = \theta_i$ $(i = 1, 2, \ldots, p)$.
Also if $v_{i1}, v_{i2}, \ldots, v_{i(t-1)}$ denote the $t - 1$ components of the sum of
squares due to the main effects of $A_i$, then $\sum_{j=1}^{t-1} v_{ij} = v_i$ $(i = 1, 2, \ldots, p)$.
We shall consider the problem of simultaneous confidence bounds on all
linear functions (of unit length) of the $\mu_{ij}$'s.


*Work sponsored by the Office of Naval Research under Contract NR 042 031 at Chapel Hill.
**Present address: Department of Statistics, University of Baroda, India.


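3. Solution.
(a) Under the set-up given in 2(a), it is known that $ns_i^2/\sigma_i^2$, where

$$s_i^2 = \Bigl[\sum_{j=1}^{n+1} (x_{ij} - \bar x_i)^2/n\Bigr], \qquad \bar x_i = \Bigl[\sum_{j=1}^{n+1} x_{ij}/(n + 1)\Bigr] \quad (i = 1, 2, \ldots, k),$$

has a chi-square distribution with $n$ d.f. Also it is well known that
$F_{ii'} = s_i^2\sigma_{i'}^2/s_{i'}^2\sigma_i^2$ $(i \neq i';\ i, i' = 1, 2, \ldots, k)$ has an F distribution with
$(n, n)$ d.f.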
Also 
Ww Shee (1) 
implies 
2 2 2 
Sir Oj" Sel 
F ‘si Ss a; = s; (2) 
that is, 
CAM ee reek y 
Si. 2 a 2 i 8. (3) 
or 
2 2 2 
8; Gi Ey Ge s 
F's:, S Oy = ei (4) 
Let $W$ be the intersection of the regions (1) for $i \neq i' = 1, 2, \ldots, k$.
Then clearly the necessary and sufficient condition for the sample point
to lie in $W$ is that

Re = vee < uaa (5) 
where 


Fr = (6) 


iZ 
gins) 
aan 
Alin 
~. bole. bo 
SA 
~) 
*. bl 
one 


Thus, if we set $F' = F'_\alpha$, where $F'_\alpha$ is the upper $\alpha$ point of the
distribution of the $F_{\max}$ ratio with $(n, n)$ d.f. and based on $k$ variances [2], then

$$P\left[\frac{1}{F'_\alpha}\,\frac{s_i^2}{s_{i'}^2} \le \frac{\sigma_i^2}{\sigma_{i'}^2} = \gamma_{ii'} \le F'_\alpha\,\frac{s_i^2}{s_{i'}^2} \ \text{ for all } i, i' = 1, 2, \ldots, k,\ i \neq i'\right] = 1 - \alpha. \qquad (7)$$

The associated test of the hypothesis that all the $k$ variances are equal
is obtained by using as the region of acceptance

$$\sup_{i, i'} (s_i^2/s_{i'}^2) = F_{\max} = (s_{\max}^2/s_{\min}^2) \le F'_\alpha. \qquad (8)$$

We have proved [6] that the associated test [3] is unbiased.

(b) Under the set-up given in 2(b), it is well known that $F_i =
[\sum_{j=1}^{t-1} (x_{ij} - \mu_{ij})^2/(t - 1)s_1^2]$ has an ordinary F distribution with $(t - 1,
q)$ d.f. $(i = 1, 2, \ldots, p)$, where $s_1^2$ = Est. var $(x_{ij})$. Hence using methods
similar to those given in [8, 9], it is easy to check that a set of
simultaneous confidence bounds on all linear functions (of unit length) of
$\mu_{ij}$ (for all $i = 1, 2, \ldots, p$) is given by

$$\sum_{j=1}^{t-1} a_j x_{ij} - \sqrt{(t - 1)u_\alpha s_1^2} \le \sum_{j=1}^{t-1} a_j \mu_{ij} \le \sum_{j=1}^{t-1} a_j x_{ij} + \sqrt{(t - 1)u_\alpha s_1^2}, \qquad (9)$$

where $u_\alpha$ is the upper $\alpha$ point of the Studentized largest chi-square
[7]. Upper 5 per cent points of the Studentized largest chi-square are
given in Table 1 for certain values of the parameters. In this situation
we have proved [7] that the associated test has the monotonicity
property.

4. Examples to illustrate the use of 2(a) and 2(b).


(a) Table 2* shows measurements of tensile strength $x_{ij}$ $(i =
1, 2, \ldots, 5;\ j = 1, 2, \ldots, 6)$ made on 6 specimens of rubber randomly
selected from each of 5 different batches.
We assume that the $x_{ij}$'s are from normal populations with means
$\mu_i$ and variances $\sigma_i^2$.

*Data taken from [4] page 38.


TABLE 1
Upper 5% points of u when t = 3 and for different values of p and q

  q \ p     1       2       3       4       5       6       7       8

    5     5.79    7.88    9.24   10.26   11.08   11.76   12.35   12.87
    6     5.14    6.90    8.03    8.88    9.56   10.12   10.61   11.04
    7     4.74    6.28   if Dat   8.01    8.60    9.09    9.51    9.88
    8     4.46    5.86    6.75    7.42    7.95    8.39    Saat    9.11
   10     4.10    5.32    6.09    6.66    4.12    fw)     7.83    8.11
   12     3.89    4.99    5.69    6.21    6.62    6.96    (ees    (sok
   16     3.63    4.62    5.23    5.68    6.04    6.33    6.59    6.81
   20     3.49    4.41    4.98    5.39    ere     5.98    6.22    6.42
   24     3.40    4.29    4.82    5.20    5.51    5.76    5.98    6.17
   ∞      3.00    3.69    4.08    4.36    4.58    4.76    4.92    5.05

TABLE 2
Measurements of tensile strength (in kg/cm²) of specimens of rubber.

                                  Batch number
Specimen
number                 i=1      i=2      i=3      i=4      i=5

j=1                    177      116      170      181      177
j=2                    172      179      156      190      186
j=3                    137      182      188      210      199
j=4                    196      143      212      173      202
j=5                    145      156      164      172      204
j=6                    168      174      184      187      198

Mean $\bar x_i$        165.8    158.3    179.0    185.5    194.3
Mean square $s_i^2$    468.6    653.1    406.0    196.3    111.5


Simultaneous confidence bounds for all ratios of the variances, $\gamma_{ii'} =
\sigma_i^2/\sigma_{i'}^2$ $(i \neq i';\ i, i' = 1, 2, \ldots, 5)$ will be obtained by using (7).

Now the upper 5 per cent value of $F_{\max}$ with $k = 5$, $n = 5$ is 16.3.
Hence substituting in (7) we get, with probability .95


.0440 < $\gamma_{12}$ < 11.6953
.0708 < $\gamma_{13}$ < 18.8135
.1464 < $\gamma_{14}$ < 38.9114
.2578 < $\gamma_{15}$ < 68.5040
.0987 < $\gamma_{23}$ < 26.2202
.2041 < $\gamma_{24}$ < 54.2317          (10)
.3593 < $\gamma_{25}$ < 95.4756
.1269 < $\gamma_{34}$ < 33.7133
.2234 < $\gamma_{35}$ < 59.3532
.1080 < $\gamma_{45}$ < 28.6962

(b) ‘Table 3* gives the plan and yields of beans in pounds of a 25 
factorial experiment conducted by the Rothamsted Experimental 
Station in 1936. 


(10) 


TABLE 3


Rep II 
The four main effects D, N, P, K are given below

$$\bar{x}_{11} = -1.00, \quad \bar{x}_{21} = -12.75, \quad \bar{x}_{31} = 1.75, \quad \bar{x}_{41} = 1.50 \tag{11}$$


Suppose we are interested in simultaneously estimating the four main
effects only. The error sum of squares (in the analysis of variance)
with 14 d.f. is 340.0. Hence $s^2$ = 340.0/14 = 24.29 with q = 14 d.f.
Now Est. var ($\bar{x}_{ij}$) = 24.29/2 = 12.14 (i, j = 1, 2, 3, 4) and the upper 5
per cent point of $\sqrt{u}$ is 2.84 (obtained by interpolation in Table III [5]).
Hence using (9) we have with a probability ≥ .95

$$-10.89 \le \mu_{11} \le 8.89$$
$$-22.64 \le \mu_{21} \le -2.86$$
$$-8.14 \le \mu_{31} \le 11.64$$
$$-8.39 \le \mu_{41} \le 11.39 \tag{12}$$
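The bounds in (12) follow from the quantities already quoted: each estimated main effect plus or minus $\sqrt{u}$ times its estimated standard error. A minimal Python sketch of that arithmetic is given here; the variable names are illustrative, and the interval form "estimate ± 2.84 × s.e." is inferred from the printed numbers rather than copied from (9).

```python
from math import sqrt

# Estimated main effects from (11).
effects = {"D": -1.00, "N": -12.75, "P": 1.75, "K": 1.50}

error_ss, error_df = 340.0, 14      # error sum of squares and its d.f.
s2 = error_ss / error_df            # about 24.29
var_effect = s2 / 2                 # estimated variance of each effect, about 12.14
sqrt_u_5pct = 2.84                  # tabulated upper 5% point of sqrt(u)

half_width = sqrt_u_5pct * sqrt(var_effect)

# Simultaneous intervals: estimate +/- half_width for every main effect.
intervals = {
    name: (estimate - half_width, estimate + half_width)
    for name, estimate in effects.items()
}

for name, (lower, upper) in intervals.items():
    print(f"{name}: {lower:7.2f} <= effect <= {upper:6.2f}")
```

Only the interval for N excludes zero, which is the practical conclusion one would draw from (12).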


Notice that $\mu_{11}$, $\mu_{21}$, $\mu_{31}$, $\mu_{41}$ are respectively the true main effects of
D, N, P, K.

5. Acknowledgement. I wish to acknowledge my indebtedness to 
RANDOM GENETIC DRIFT IN A TRI-ALLELIC LOCUS;
EXACT SOLUTION WITH A CONTINUOUS MODEL*


Motoo Kimura


Department of Genetics, University of Wisconsin 


1. Introduction 


Random genetic drift is a stochastic process of change in gene 
frequency in finite populations due to random sampling of gametes in 
reproduction. Since the pioneering works by Fisher (1922, 1930) and 
Wright (1931), much attention has been paid to this phenomenon and 
many theoretical as well as experimental studies have been attempted. 
A brief review on this topic will be found in a previous paper (Kimura 
1955b). The evolutionary significance of random drift is still in dispute 
(Fisher and Ford 1947, 1950; Wright 1948, 1951), and decisive evidence 
for any conclusion is still missing. Haldane (1954) has suggested an 
analysis of frequencies of antigenic characters among neighboring 
populations for this purpose. Recently, Glass (1954) reviewed some 
evidence for the operation of random drift in human populations. From 
the genetical point of view, it is highly probable that there exists a class 
of genes so nearly neutral in selective value that random genetic drift 
plays a prominent role in determining the local differentiation of the 
gene frequencies. The best examples are to be found in certain isoalleles 
in Drosophila and other organisms. From the standpoint of math- 
ematical genetics, the problem of random drift provides an area where 
the theory of Markov processes finds important applications (Feller 
1951, Crow and Kimura 1955). 

In a recent paper the present author reported a complete solution
of the process for the case of a pair of alleles (Kimura 1955a). With
multiple alleles the problem becomes more difficult, and the solution
even for three alleles (Kimura 1955b) contains a function $C_i(x, y)$
and only the first three terms in the expansion are given explicitly. The


*Paper No. 492 from the Department of Genetics, University of Wisconsin. Also Contribution 
No. 113 of the National Institute of Genetics, Mishima-shi, Japan. 
purpose of the present paper is to give the exact solution for a triallelic
locus. First I shall summarize briefly the previous asymptotic results 
and then show how the exact solution may be obtained by the use of 
partial differential equations. 


2. The asymptotic formulae obtained by the calculation of moments of the 
distribution 


Consider a randomly mating population of effective size N. Let
$x_t$, $y_t$ and $z_t$ ($= 1 - x_t - y_t$) be the frequencies of alleles $A_1$, $A_2$ and $A_3$
respectively in the t-th generation. We denote by $\mu'^{(t)}_{m,n}$ the (m, n)-th moment
of the distribution about zero at the t-th generation, such that $\mu'^{(t)}_{m,n} = E(x_t^m y_t^n)$.
In natural populations the number of individuals is usually
large and we consider the case where N is sufficiently large that $1/N^2$
may be neglected. Due to the random sampling of gametes in reproduction,
the moments change gradually from generation to generation
and, with suitable assumptions, we can derive the following infinite
system of differential equations:

$$\frac{d\mu'^{(t)}_{m,n}}{dt} = -\frac{(m+n)(m+n-1)}{4N}\,\mu'^{(t)}_{m,n} + \frac{m(m-1)}{4N}\,\mu'^{(t)}_{m-1,n} + \frac{n(n-1)}{4N}\,\mu'^{(t)}_{m,n-1}, \qquad (m, n = 1, 2, 3, \ldots) \tag{1}$$


If the initial frequencies of $A_1$, $A_2$ and $A_3$ in the population are p, q
and r respectively (p + q + r = 1), we have $\mu'^{(0)}_{m,n} = p^m q^n$ as the initial
condition of (1). It can be shown that (1) has the solution of the form:

$$\mu'^{(t)}_{m,n} = \sum_{i=1}^{m+n-1} C^{(i)}_{m,n} \exp\left\{-\frac{i(i+1)}{4N}\,t\right\}, \qquad (m, n = 1, 2, 3, \ldots) \tag{2}$$

where the $C^{(i)}_{m,n}$'s are constants. From this we can obtain the various
probability distributions of the gene frequencies. The most important
one is the joint distribution of the frequencies of $A_1$ and $A_2$ (and,
therefore, of $A_3$ also) in the population which contains the three alleles.
We denote by $\phi(x, y \mid p, q; t)$ the density of the conditional probability
that the frequency of $A_1$ lies between x and x + dx and that of $A_2$ lies
between y and y + dy in the t-th generation (0 < x < x + y < 1),
given that they started from x = p and y = q at t = 0 (0 < p < p + q
< 1). As was shown previously (Kimura 1955b) $\phi$ must have the
form:

$$\phi(x, y \mid p, q; t) = \sum_{i=1}^{\infty} C_i(x, y) \exp\left\{-\frac{(i+1)(i+2)}{4N}\,t\right\}. \tag{3}$$




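The coefficients $C_i(x, y)$ are functions of x and y only and can be
obtained from the $C^{(i)}_{m,n}$'s. The calculations involved, however, are so
tedious that only the first few coefficients have been obtained:

$$C_1(x, y) = 5!\,pqr,$$

$$C_2(x, y) = \frac{7!}{2}\,pqr\left\{\left(p - \tfrac{1}{3}\right)\left(x - \tfrac{1}{3}\right) + \left(q - \tfrac{1}{3}\right)\left(y - \tfrac{1}{3}\right) + \left(r - \tfrac{1}{3}\right)\left(z - \tfrac{1}{3}\right)\right\},$$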


: 1 9! : 2 3 3\ 5 2 3 3 2 
tats eS. Ls ie u 
+ (PS 4 Se + a(n - PEL 4 Tey 
ms 3(pr = pat ~b 1 ee _ 3( a — it a La, 


where r = 1 − p − q and z = 1 − x − y. Since for large t the exponential
terms decrease rapidly as i increases, only the first few terms
in (3) are important and we obtain the asymptotic formula:

$$\phi(x, y \mid p, q; t) \sim C_1(x, y) \exp\left\{-\frac{3}{2N}\,t\right\} + C_2(x, y) \exp\left\{-\frac{3}{N}\,t\right\}, \qquad (t \to \infty) \tag{4}$$


Thus the final rate of decay of the distribution surface is 3/2N per
generation. The probability $Q^{(t)}$ that all three alleles still co-exist
in the population at the t-th generation is obtained by

$$Q^{(t)} = \int_0^1 \int_0^{1-y} \phi(x, y \mid p, q; t)\, dx\, dy;$$

$$Q^{(t)} = 60\,pqr \exp\left\{-\frac{3}{2N}\,t\right\} + 90\,pqr\{7(p^2 + q^2 + r^2) - 3\} \exp\left\{-\frac{5}{N}\,t\right\} + \cdots \tag{5}$$

This result can also be obtained by other methods. Detailed derivation 
of these formulas and graphical illustrations of the distribution surfaces 
are given in a previous paper (Kimura 1955b). 
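As a numerical illustration of the asymptotic formula for the probability of co-existence, the two leading terms can be evaluated directly. The sketch below is only that, a two-term truncation: the function name `coexistence_probability` and its argument order are assumptions made for the example, and the truncation is unreliable for small t (it is not a probability near t = 0).

```python
from math import exp


def coexistence_probability(p, q, t, n_eff):
    """Two leading terms of (5) for initial frequencies p, q and r = 1 - p - q.

    n_eff is the effective population size N; t is time in generations.
    """
    r = 1.0 - p - q
    pqr = p * q * r
    leading = 60.0 * pqr * exp(-3.0 * t / (2.0 * n_eff))
    second = 90.0 * pqr * (7.0 * (p * p + q * q + r * r) - 3.0) * exp(-5.0 * t / n_eff)
    return leading + second


# For large t the ratio of successive values approaches exp(-3/(2N)),
# i.e. a decay of about 3/(2N) per generation.
N = 50
late = coexistence_probability(0.2, 0.3, 2000, N)
later = coexistence_probability(0.2, 0.3, 2001, N)
print(later / late, exp(-3.0 / (2.0 * N)))
```

With equal starting frequencies p = q = r = 1/3 the two terms cancel at t = 0, which is one way to see that the omitted higher terms still matter in the early generations.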


3. Solution by partial differential equations 


The method of partial differential equations has been used for the 
study of random drift by Fisher (1922, 1930) and Wright (1945) and 
has proved to be a very powerful tool (Kimura 1955a). This method 
can be extended to the case of multiple alleles and multiple loci (Crow
and Kimura 1955). With three alleles at a single locus the equation 
is written in the form; 


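$$\frac{\partial \phi}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\{V_{\delta x}\phi\} + \frac{\partial^2}{\partial x\,\partial y}\{W_{\delta x \delta y}\phi\} + \frac{1}{2}\frac{\partial^2}{\partial y^2}\{V_{\delta y}\phi\} - \frac{\partial}{\partial x}\{M_{\delta x}\phi\} - \frac{\partial}{\partial y}\{M_{\delta y}\phi\}, \tag{6}$$

where $\delta x$ and $\delta y$ are the rates of change in x and y per generation. In
this equation V, W and M denote respectively the variance, covariance
and the mean of the quantities identified by subscripts. In the present
case

$$V_{\delta x} = \frac{x(1-x)}{2N}, \qquad W_{\delta x \delta y} = -\frac{xy}{2N}, \qquad V_{\delta y} = \frac{y(1-y)}{2N}$$

and

$$M_{\delta x} = M_{\delta y} = 0.$$

Therefore equation (6) becomes

$$\frac{\partial \phi}{\partial t} = \frac{1}{4N}\frac{\partial^2}{\partial x^2}\{x(1-x)\phi\} - \frac{1}{2N}\frac{\partial^2}{\partial x\,\partial y}\{xy\phi\} + \frac{1}{4N}\frac{\partial^2}{\partial y^2}\{y(1-y)\phi\}. \tag{7}$$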


We start from a population in which the frequencies of $A_1$ and $A_2$
are p and q respectively. Therefore the initial condition is

$$\phi(x, y \mid p, q; 0) = \delta(x - p)\cdot\delta(y - q), \tag{7'}$$

where $\delta(x)$ represents Dirac's delta function. The equation (7) has
singularities at the boundaries and no arbitrary conditions can be
imposed there.


To solve (7), we seek solutions of the following form as suggested in
(3):

$$\phi = Z e^{-\lambda t}, \tag{8}$$

where Z is a function of x and y but not of t. $\lambda$ is given by

$$\lambda = \frac{(i+1)(i+2)}{4N},$$

where i is a positive integer.
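By substituting (8) into (7) and transforming the independent
variables by

$$x = \rho(1 - \xi) \quad \text{and} \quad y = \rho\xi \qquad (0 \le \rho \le 1),$$

(7) becomes

$$\rho(1-\rho)\frac{\partial^2 Z}{\partial \rho^2} + \frac{\xi(1-\xi)}{\rho}\frac{\partial^2 Z}{\partial \xi^2} + 2(2 - 3\rho)\frac{\partial Z}{\partial \rho} + \frac{2(1 - 2\xi)}{\rho}\frac{\partial Z}{\partial \xi} + (i-1)(i+4)Z = 0. \tag{9}$$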

The above transformation makes it possible to apply the standard 
separation technique to solve the equation. Let 


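$$Z = R \cdot \Theta,$$


where R is a function of $\rho$ only and $\Theta$ is a function of $\xi$ only. By this
substitution (9) can be changed into

$$\frac{1}{R}\left\{\rho^2(1-\rho)\frac{d^2R}{d\rho^2} + 2\rho(2 - 3\rho)\frac{dR}{d\rho} + (i-1)(i+4)\rho R\right\} = -\frac{1}{\Theta}\left\{\xi(1-\xi)\frac{d^2\Theta}{d\xi^2} + 2(1 - 2\xi)\frac{d\Theta}{d\xi}\right\}. \tag{11}$$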


By assumption, R is a function of $\rho$ only and hence the left side of (11)
depends only on $\rho$, while $\Theta$ is a function of $\xi$ only and hence the right
side of (11) depends only on $\xi$. It follows then that both sides of (11)
must equal a constant which we shall designate by K. Thus (11) can
be separated into the two equations:

$$\rho^2(1-\rho)\frac{d^2R}{d\rho^2} + 2\rho(2 - 3\rho)\frac{dR}{d\rho} + \{(i-1)(i+4)\rho - K\}R = 0 \tag{12}$$

$$\xi(1-\xi)\frac{d^2\Theta}{d\xi^2} + 2(1 - 2\xi)\frac{d\Theta}{d\xi} + K\Theta = 0. \tag{13}$$

First we identify (13) as the hypergeometric equation:

$$\xi(1-\xi)\Theta'' + [\gamma - (\alpha + \beta + 1)\xi]\Theta' - \alpha\beta\Theta = 0, \tag{14}$$

where

$$\gamma = 2, \qquad \alpha + \beta = 3 \qquad \text{and} \qquad \alpha\beta = -K,$$

in this case.

Therefore we take

$$\alpha = \frac{3 + \sqrt{9 + 4K}}{2}, \qquad \beta = \frac{3 - \sqrt{9 + 4K}}{2}.$$


Though we cannot impose an arbitrary condition at the boundaries, we
do want a solution which is finite at these singular points ($\xi$ = 0 and 1).
Of the two independent solutions of (14), only one of them, i.e.
$F(\alpha, \beta, 2, \xi)$, is finite at $\xi = 0$ in this case. In order to find the condition
which makes $F(\alpha, \beta, 2, \xi)$ finite at the other singularity ($\xi = 1$), we
note the following relation:

$$F(\alpha, \beta, 2, \xi) = \frac{\Gamma(2)\Gamma(2 - \alpha - \beta)}{\Gamma(2 - \alpha)\Gamma(2 - \beta)}\,F(\alpha, \beta, -1 + \alpha + \beta, 1 - \xi) + \frac{\Gamma(2)\Gamma(\alpha + \beta - 2)}{\Gamma(\alpha)\Gamma(\beta)}\,(1 - \xi)^{2 - \alpha - \beta}\,F(2 - \alpha, 2 - \beta, 3 - \alpha - \beta, 1 - \xi).$$

Noting that $\alpha + \beta = 3$, we see that in order that $\lim_{\xi \to 1} F(\alpha, \beta, 2, \xi)$
be finite, $2 - \alpha$ must be a negative integer and $\beta$ must be 0 or a negative
integer. Thus the only possible values of K are expressed by


$$K = (m - 1)(m + 2),$$


where the m's are positive integers (m = 1, 2, 3, ···). Corresponding
to this eigenvalue K, we have $\alpha = m + 2$, $\beta = 1 - m$, and if we put
$\xi = (1 - \theta)/2$, we then have

$$\Theta = F\left(m + 2,\, 1 - m,\, 2,\, \frac{1 - \theta}{2}\right),$$

except that it may be multiplied by a constant.

It is convenient to express $\Theta$ in terms of the Gegenbauer polynomial
$T^1_{m-1}(\theta)$ which is defined by

$$T^1_{m-1}(\theta) = \frac{1}{2}\,m(m + 1)\,F\left(m + 2,\, 1 - m,\, 2,\, \frac{1 - \theta}{2}\right). \tag{15}$$

It is known that the Gegenbauer polynomials $T^1_n(\theta)$ (n = 0, 1, 2, ···)
form a complete orthogonal system for the interval $-1 \le \theta \le 1$ with
the density function $(1 - \theta^2)$, and we have the following normalization
integral (see Morse and Feshbach 1953, p. 782-783).

$$\int_{-1}^{1} (1 - \theta^2)\,T^1_n(\theta)\,T^1_{n'}(\theta)\,d\theta = \delta_{n,n'}\,\frac{2(n + 2)(n + 1)}{2n + 3}, \tag{16}$$

where $\delta_{n,n'}$ is the Kronecker delta function. It is also worth noting
that $T^1_n(\theta)$ is finite at $\xi = 1$ for finite n.
Next, we must reduce the equation (12) to a manageable form. For
this purpose we put

$$R = \rho^{m-1}\,U.$$

Then (12) reduces to the hypergeometric equation:

$$\rho(1 - \rho)\frac{d^2U}{d\rho^2} + \{2(m + 1) - (2m + 4)\rho\}\frac{dU}{d\rho} - (m - i)(m + i + 3)U = 0. \tag{17}$$

This gives us the Jacobi polynomial as a pertinent solution:

$$U = J_{i-m}(2m + 3,\, 2m + 2,\, \rho), \qquad (i = m, m + 1, \cdots) \tag{18}$$

Here the Jacobi polynomial is defined by

$$J_n(a, c, \rho) = F(a + n, -n, c, \rho).$$

It is known that $\{J_n\}$, n = 0, 1, 2, ···, form a complete orthogonal
system for the interval $0 \le \rho \le 1$ with the density function $x^{c-1}(1 - x)^{a-c}$
(Morse and Feshbach 1953, p. 780-781).

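Combining all the above results, we express the solution in the form

$$\phi(x, y \mid p, q; t) = \sum_{m=1}^{\infty}\sum_{i=m}^{\infty} C(m - 1, i - m)\,\rho^{m-1} J_{i-m}(2m + 3, 2m + 2, \rho)\,T^1_{m-1}(\theta) \exp\left\{-\frac{(i + 1)(i + 2)}{4N}\,t\right\}$$

or by putting m − 1 = n and i − m = j,

$$\phi(x, y \mid p, q; t) = \sum_{n=0}^{\infty}\sum_{j=0}^{\infty} C(n, j)\,\rho^{n} J_j(2n + 5, 2n + 4, \rho)\,T^1_n(\theta) \exp\left\{-\frac{(n + j + 2)(n + j + 3)}{4N}\,t\right\}, \tag{19}$$

where the C(n, j)'s are constants and

$$\theta = \frac{x - y}{x + y}, \qquad \rho = x + y. \tag{20}$$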


Now we use the initial condition (7') to determine C(n, j).
From (19), we have (putting t = 0)

$$\delta(x - p)\cdot\delta(y - q) = \sum_{n=0}^{\infty}\sum_{j=0}^{\infty} C(n, j)\,\rho^{n} J_j(2n + 5, 2n + 4, \rho)\,T^1_n(\theta). \tag{21}$$


We next multiply both sides of (21) by $\rho^{n'+3}(1 - \rho)J_k(2n' + 5, 2n' + 4, \rho)\,(1 - \theta^2)\,T^1_{n'}(\theta)$
and integrate over $0 < \rho < 1$, $-1 < \theta < 1$. If we
use the orthogonality relations

$$\int_0^1 \rho^{2n+3}(1 - \rho)\,J_j(2n + 5, 2n + 4, \rho)\,J_k(2n + 5, 2n + 4, \rho)\,d\rho = \frac{j!\,(j + 1)!\,[(2n + 3)!]^2}{(j + 2n + 3)!\,(j + 2n + 4)!\,(2j + 2n + 5)}\,\delta_{j,k} \tag{22}$$

and (16), the right side becomes

$$C(n', k)\,\frac{2\,k!\,(k + 1)!\,(n' + 1)(n' + 2)\cdot(2n' + 2)!\,(2n' + 3)!}{(k + 2n' + 3)!\,(k + 2n' + 4)!\,(2k + 2n' + 5)}.$$

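If we notice that the Jacobian of the transformation (20) is

$$\frac{\partial(\theta, \rho)}{\partial(x, y)} = \frac{2}{x + y},$$

the left side becomes

$$8pq(p + q)^{n'}(1 - p - q)\,T^1_{n'}\!\left(\frac{p - q}{p + q}\right) J_k(2n' + 5, 2n' + 4, p + q).$$

Therefore

$$C(n, j) = \frac{4\,(j + 2n + 3)!\,(j + 2n + 4)!\,(2j + 2n + 5)}{j!\,(j + 1)!\,(n + 1)(n + 2)\cdot(2n + 2)!\,(2n + 3)!} \times pqr(1 - r)^{n}\,T^1_n\!\left(\frac{p - q}{1 - r}\right) J_j(2n + 5, 2n + 4, 1 - r), \tag{23}$$

where r = 1 − p − q.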
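We then write the final result in the form:

$$\phi(x, y \mid p, q; t) = \sum_{n=0}^{\infty}\sum_{j=0}^{\infty} C(n, j)\,(1 - z)^{n}\,T^1_n\!\left(\frac{x - y}{1 - z}\right) J_j(2n + 5, 2n + 4, 1 - z) \exp\left\{-\frac{(n + j + 2)(n + j + 3)}{4N}\,t\right\}, \tag{24}$$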
where z = 1 − x − y. The functions $T^1_n(\cdot)$ and $J_j(\cdot, \cdot, \cdot)$ are respectively
the Gegenbauer and Jacobi polynomials as defined above.
It is not hard to verify that (24) generates the asymptotic formula
given in (4), if we notice that

$$T^1_0(\theta) = 1, \quad T^1_1(\theta) = 3\theta, \quad \cdots$$

$$J_0(a, c, \rho) = 1, \quad J_1(5, 4, \rho) = 1 - \tfrac{3}{2}\rho, \quad \cdots$$

Also it should not be difficult to prove the uniform convergence of the
series (24) for t > 0, since the exponential terms decrease very rapidly
for large n and j.
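The claim that (24) reproduces the leading coefficients of (4) can be checked numerically. The Python sketch below evaluates the terminating hypergeometric series directly, builds $T^1_n$ and $J_j$ from definitions (15) and (18), and sums the terms of (24) sharing a common exponent. All function names are illustrative; the closed forms used for comparison are $C_1 = 5!\,pqr$ and the symmetric expression $C_2 = (7!/2)\,pqr\sum (p - 1/3)(x - 1/3)$ taken over the three alleles.

```python
from math import factorial


def hyp_poly(a, n, c, x):
    """Terminating hypergeometric series F(a, -n; c; x) for integer n >= 0."""
    total, term = 1.0, 1.0
    for k in range(n):
        term *= (a + k) * (k - n) / ((c + k) * (k + 1)) * x
        total += term
    return total


def gegenbauer_t1(n, theta):
    """T^1_n(theta) from (15), written with m = n + 1."""
    return 0.5 * (n + 1) * (n + 2) * hyp_poly(n + 3, n, 2, (1.0 - theta) / 2.0)


def jacobi_j(j, a, c, rho):
    """J_j(a, c, rho) = F(a + j, -j; c; rho)."""
    return hyp_poly(a + j, j, c, rho)


def coefficient(n, j, p, q):
    """C(n, j) of (23) for initial frequencies p, q."""
    r = 1.0 - p - q
    num = 4 * factorial(j + 2 * n + 3) * factorial(j + 2 * n + 4) * (2 * j + 2 * n + 5)
    den = (factorial(j) * factorial(j + 1) * (n + 1) * (n + 2)
           * factorial(2 * n + 2) * factorial(2 * n + 3))
    return (num / den) * p * q * r * (1 - r) ** n \
        * gegenbauer_t1(n, (p - q) / (1 - r)) * jacobi_j(j, 2 * n + 5, 2 * n + 4, 1 - r)


def level_coefficient(i, x, y, p, q):
    """Sum of the terms of (24) with n + j + 1 = i, i.e. C_i(x, y) of (3)."""
    z = 1.0 - x - y
    total = 0.0
    for n in range(i):
        j = i - 1 - n
        total += coefficient(n, j, p, q) * (1 - z) ** n \
            * gegenbauer_t1(n, (x - y) / (1 - z)) * jacobi_j(j, 2 * n + 5, 2 * n + 4, 1 - z)
    return total


p, q, x, y = 0.2, 0.3, 0.1, 0.6
r, z = 1 - p - q, 1 - x - y
c1_closed = factorial(5) * p * q * r
c2_closed = factorial(7) / 2 * p * q * r * (
    (p - 1 / 3) * (x - 1 / 3) + (q - 1 / 3) * (y - 1 / 3) + (r - 1 / 3) * (z - 1 / 3)
)
print(level_coefficient(1, x, y, p, q), c1_closed)
print(level_coefficient(2, x, y, p, q), c2_closed)
```

Grouping terms by n + j is the natural design here, because every pair (n, j) with the same sum shares the exponent (n + j + 2)(n + j + 3)/4N and so contributes to the same coefficient of (3).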


4. Extension to more than three alleles 


The important point in the above treatment is that the method 
can be extended to a larger number of alleles. 


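Thus with four alleles, say $A_1$, $A_2$, $A_3$ and $A_4$ whose frequencies
are x, y, z and u respectively (x + y + z + u = 1), we have the following
partial differential equation:

$$\phi_t = \frac{1}{4N}\{x(1 - x)\phi\}_{xx} + \frac{1}{4N}\{y(1 - y)\phi\}_{yy} + \frac{1}{4N}\{z(1 - z)\phi\}_{zz} - \frac{2}{4N}\{xy\phi\}_{xy} - \frac{2}{4N}\{yz\phi\}_{yz} - \frac{2}{4N}\{xz\phi\}_{xz}. \tag{25}$$

The transformations

$$x = \rho\xi, \qquad y = \rho(1 - \xi)\eta, \qquad z = \rho(1 - \xi)(1 - \eta)$$

reduce the above equation to:

$$\rho(1 - \rho)\Phi_{\rho\rho} + \frac{\xi(1 - \xi)}{\rho}\Phi_{\xi\xi} + \frac{\eta(1 - \eta)}{\rho(1 - \xi)}\Phi_{\eta\eta} + 2(3 - 4\rho)\Phi_{\rho} + \frac{2(1 - 3\xi)}{\rho}\Phi_{\xi} + \frac{2(1 - 2\eta)}{\rho(1 - \xi)}\Phi_{\eta} + (i - 1)(i + 6)\Phi = 0.$$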


Carrying out a similar but somewhat more complicated procedure we
can separate this into three equations for each of the three variables
$\rho$, $\xi$ and $\eta$. The final solution is then expressed as a linear combination of
the products of the solutions of these component equations and an
exponential term for t. The details will not be given here.

The above argument will be enough to suggest the techniques by 
which the general case of an arbitrary number of alleles can be solved. 
However, additional techniques will be needed to make the mathe- 
matical manipulations manageable. 


5. Summary 

The exact solution for the process of random genetic drift in a 
triallelic locus has been obtained by solving the partial differential 
(Kolmogorov) equation (7) based on a continuous model. 
The probability distribution of gene frequencies in the unfixed
classes where all the three alleles coexist (24) indicates that the distri- 
bution surface finally becomes flat and decreases in height at the rate 
of 3/(2N) per generation as opposed to 1/(2N) for a pair of alleles. This 
confirms the asymptotic solution (4) previously obtained by another 
method. 

The applicability of the present method to cases with more than 
three alleles has been discussed. The biological implications of the 
problem have been considered in detail elsewhere (Kimura, 1955b). 
MULTIVARIATE ANALYSIS AND AGRICULTURAL
EXPERIMENTS 


D. J. FINNEY 


Agricultural Research Council Unit of Statistics, 
University of Aberdeen 


The aim of statistical science must always be to aid the research 
worker in making the best possible use of his efforts and his results; 
one important function for Biometrics is to provide a forum for the 
exchange of opinions on how this aim can be achieved in the biological 
sciences. Amongst the many papers on statistical science published 
today, some appear to find new outlets for mathematical theory without 
materially assisting scientific research. In recent years, I have been 
particularly aware of papers of this kind on multivariate analysis. 
Statisticians evidently hold widely divergent views on the practical 
importance of various types of multivariate analysis: I suggest that 
we need to examine carefully their relevance to the interpretation of 
experimental and observational data. 

In this note, I propose to be severely critical of the use of multi- 
variate analysis of variance and the construction of canonical variates 
in the analysis and interpretation of agricultural and other experiments. 
My argument can best be presented by reference to a particular example, 
and I therefore discuss in detail the recent paper by R. G. D. Steel 
(1955). Of course I intend no personal attack on Dr. Steel’s work, 
but his interesting paper happens to illustrate my criticisms especially 
simply and clearly. Employment of the methods he describes appears 
to be increasing, and other applications that are, in my view, equally 
unfortunate can be found (e.g., Dutton, 1954; Quenouille, 1950). 
Questions of choice of method are often less simple than an ardent 
partisan would have them: I look forward to having my own outlook 
criticized as severely as I criticize that of Dr. Steel and others, for 
clear thinking about the ultimate objectives of statistical analysis 
is more important than the vindication of a particular point of view. 

Steel has discussed an experiment comparing the yields of 25 varieties 
of alfalfa, grown on the same four randomized blocks in 1949 and 1950. 
He has given an excellent account of the numerical operations involved
in a bivariate canonical analysis on the pairs of yields from each plot, 
and has shown the essential simplicity of calculations that sometimes 
terrify when expressed purely mathematically. Unfortunately, he 
has not made clear the aim of the experiment and the manner in which 
his analysis enables the agricultural scientist to reach conclusions 
relevant to his problem that could not have been obtained by simpler 
alternatives. He implies that such methods should be more generally 
adopted for experiments whose yields are recorded in several successive 
years, or in which more diverse multiple observations are made on each 
plot; I believe this to be a dangerously misleading policy. 

The choice of a multivariate canonical analysis for experiments 
such as that discussed by Steel perhaps originates in a belief that the 
primary purpose of statistical analysis there is the making of tests of 
significance. This is not so. In the alfalfa experiment, a general test 
of the significance of differences between varieties is of practically no 
interest. Personally I should be surprised if a good experiment on as 
many as 25 varieties of a crop failed to show significant differences 
between them. If it did not, I should suspect inadequate replication, 
unexpectedly large variation by comparison with the general habit of 
the crop, or careless management of the experiment, all of which are 
features of the circumstances of the experiment and not intrinsic 
characteristics of the varieties. It is surely almost inconceivable that 
25 varieties should be distinguishable by some characteristics yet should 
have identical yield parameters: indeed, I suspect that any two
recognizably distinct varieties could be shown as differing “significantly”
in yield by sufficient increase in replication or by test over a sufficient 
number of years. Absence of a significant result in a particular ex- 
periment is usually a commentary on the experimental technique 
rather than an addition to knowledge of the varieties. 

The chief value of the analysis of variance in a variety trial is to 
determine an error variance, whence are derived assessments of standard 
errors for differences between pairs of varieties or groups of varieties. 
These standard errors are important in further stages of a selection 
programme or in the formulation of practical recommendations, since 
they provide a basis for judging whether certain differences are large 
enough to make particular courses of action desirable. For this purpose,
clearly, the variates analysed—or at any rate the variates whose mean 
values are finally summarized and discussed—must be those measures 
of crop performance that the investigator considers relevant to the 
judgements he must make. They may be the basic measurements 
(e.g. weights) made on the crop or they may be derived quantities 
(combinations of yields in several years, or starch equivalent, or per-
centage dry matter), but they can be defined only by someone who 
knows the purpose of the experiment. They cannot be deduced purely 
from statistical analysis of the numerical values of the basic measure- 
ments. For experiments on a plant that has two distinct possible 
uses (e.g. flax-linseed), the analyses that are important will depend 
upon which use the investigator has in mind. Again, in the alfalfa 
experiment, records of plant height or of leaf area might have shown 
relatively greater differences between varieties than did the yields, 
but canonical variates including them also would probably have been 
less helpful to the investigator (though more highly significant in their 
differences) than those obtained from yields alone.* 

Had I been asked to analyse the alfalfa experiment, I should certainly 
have made an analysis of variance of the total yields over the two 
years, as this total production is likely to be the chief consideration in 
deciding the relative merits of varieties. I should also have analysed 
the difference in yield between the two years, as an aid to comparing 
varieties in respect of seasonal variability in yield, a factor that might 
be important in preferring one variety to another. The fact that any
tests of significance in the two analyses might not be independent of 
one another would not worry me: the nature of the experiment demands 
consideration of these two combinations of yield. Nevertheless, Steel’s 
Table 4 suggests that any correlation between these two combinations 
(arising from unequal variances in 1949 and 1950) is quite trivial. In 
some circumstances, I might want analyses of yields in the two years 
separately, instead of or in addition to analyses of the total and difference, 
although for a forage crop grown on the same land for several years 
these will usually be less interesting.** If I were told that one ton of 
alfalfa in 1949 was twice as valuable as one ton in 1950, I might wish 
also to make an analysis of a new variate formed as twice the 1949 
yield plus the 1950 yield, although it is difficult to see what importance 
this could have in assessing the general merits of the varieties unless 
the greater value per unit weight in 1949 were a characteristic of the 
first-year growth of alfalfa rather than a consequence of particular 
economic conditions in that year. 
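The remark that any correlation between total and difference arises from unequal variances in the two years follows from the identity Cov(y₁ + y₂, y₁ − y₂) = Var(y₁) − Var(y₂). A brief Python sketch makes this concrete; the error variances and covariance used below are hypothetical figures chosen for illustration, not values from Steel's tables.

```python
from math import sqrt


def sum_and_difference_moments(var_1, var_2, cov_12):
    """Variances of (y1 + y2) and (y1 - y2) and the covariance between them."""
    var_sum = var_1 + var_2 + 2.0 * cov_12
    var_diff = var_1 + var_2 - 2.0 * cov_12
    cov_sum_diff = var_1 - var_2   # the cross terms cancel
    return var_sum, var_diff, cov_sum_diff


def correlation_sum_difference(var_1, var_2, cov_12):
    """Correlation between the two-year total and the year-to-year difference."""
    var_sum, var_diff, cov_sd = sum_and_difference_moments(var_1, var_2, cov_12)
    return cov_sd / sqrt(var_sum * var_diff)


# Hypothetical plot-error variances for the two years and their covariance.
print(correlation_sum_difference(0.050, 0.046, 0.020))   # slightly unequal variances
print(correlation_sum_difference(0.050, 0.050, 0.020))   # equal variances
```

When the two yearly error variances are equal the total and the difference are uncorrelated whatever the covariance between years, which is why the dependence between the two analyses is usually of little consequence.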

I cannot think of any circumstances in which the investigator 
would be interested in Steel’s two canonical variates, or in other 


*He might wish to take account of these variates also, if height or leaf area had any economic or
other importance comparable with that of yield, but he would certainly adopt his own weighting of the 
relative importance, not that represented by a canonical variate. 

**Of course, the complete analysis of variance and covariance shown in Steel’s Table 2 is a conven- 
ient computing procedure if more than two analyses of linear functions of the yields are required. 
combinations of yields for the two years chosen entirely on the internal
statistical evidence of the experiment. Admittedly the first of these 
variates is an optimal linear discriminant function between varieties, 
but for two reasons such a discriminant is without importance here. 
First, a function whose components are yields in two particular past 
years could not be used for discrimination in the future: the appropriate 
components, yield in 1949 and yield in 1950, can never be measured 
again! Secondly, the 25 varieties under test can presumably best be 
discriminated by whatever characteristics (habit of growth, leaf form, 
flower colour, etc.) identify them as varieties to the plant breeder, and 
no one would suggest that a character as variable in its manifestation 
as yield should be employed for the purpose! 

It happens that, in the alfalfa experiment, the sum and difference 
of the yields are almost as effective discriminators between varieties as 
the two canonical variates. My objection to the latter, however, is 
not that in this instance they could be replaced by something simpler 
without loss but that, whatever the numerical results of this experiment 
had been, their values would be of no interest to the experimenter 
whereas the sum and difference would always have been important. 
The mean values shown in Table 7 may represent a mathematical 
simplicity of structure, but as they stand they are not the slightest 
help to the investigator who wishes to know which varieties to subject 
to further more refined comparison or which varieties to release for 
commercial use. Steel claims that ‘This analysis helps locate varieties 
that are consistently good (poor) but sometimes do even better (poorer) 
than expected and ones that are good (poor) in some years and not 
exceptional or are even poor (good) in others’. The terms good and 
poor, however, are meaningful only with reference to factors outside 
the experiment, concerned with the use to which the produce is to be 
put: no internal statistical analysis of multiple measurements on each 
plot can produce a function that is a measure of “goodness”.

Steel suggests that the property of independence possessed by the 
means in his Table 7 ‘is useful in making exact probability statements’. 
My contention is that inexact or non-independent probability state- 
ments about quantities that are meaningful to the interpretation of the 
experiment are far preferable to exact statements about quantities 
that are only mathematical abstractions. To have exact statements
about the right quantities would be better still. Possibly the theo- 
retical exactness of simultaneous fiducial or confidence statements 
about the various comparisons of totals and differences of yields that 
are important to the research worker could be improved by first formu- 
lating such statements for independent canonical variates and then 
converting these into statements about the variates required. The
tests of significance in the canonical analysis would still be unimportant,
but that analysis might prove to be a convenient computing technique 
for obtaining what was ultimately wanted. In the alfalfa experiment, 
I am sure that nothing of practical importance would have been gained 
by such refinements, but the possibility may merit further study as a 
realistic application of multivariate theory to plant breeders’ problems. 
Its elaboration can be left to those who are expert in that theory: it 
might, for all I know, follow as a fairly easy development of the work 
of Tukey and others on simultaneous interval estimation. My present
concern is to persuade biometricians that, without such interpretation 
in terms of estimation instead of significance testing, in field experiments 
and in many other research problems the type of multivariate analysis 
illustrated by Steel is usually inappropriate and often actively mis- 
leading. 
THE RELATION BETWEEN QUANTAL AND GRADED
RESPONSES TO DRUGS


P. S. HEWLETT AND R. L. PLACKETT


Pest Infestation Laboratory, Slough, Bucks, and 
Department of Applied Mathematics, University of Liverpool 


1. A curious dichotomy exists in the study of biological responses 
to drugs. Responses, justifiably enough, are regarded as of two distinct 
types—quantal and graded. Quantal responses are those which classify 
an organism or other unit of biological material as having responded 
or not; for example, death, paralysis, etc. A graded response is such 
that the single organism gives a response in quantitative terms; for 
example, a change in weight, or a change in blood pressure. The 
statistical treatments of the two types of response show some similarities, 
largely because regression techniques are used for both; but biologically 
speaking the quantitative descriptions of the two types of response 
have been kept rigidly separate. No attempt seems to have been made 
to discover any connection between the dose-response relationships for 
quantal responses on the one hand and those for graded responses on 
the other. 

2. The purpose here is to view the events following the strictly 
controlled administration of a drug in such a way that the probable 
interrelationships of succeeding graded and quantal responses can be 
seen and formulated quantitatively. First, however, it is necessary 
to comment on the interpretation of functions relating quantal re- 
sponses to dose. 

3. We shall, as is commonly done, assume that the relation between 
a quantal response and dose represents a cumulative distribution of 
tolerances, where the tolerance of an individual organism (or other 
unit of biological material) is the dose of drug just insufficient to make 
it show the quantal response concerned. It is fitting that a concept 

such as that of a tolerance should be scrutinized from time to time: 
Berkson (1951) has in fact challenged the concept, but he proposed 
no alternative interpretation. He cited an experiment in which indi- 
vidual men reacted differently on different occasions; but an interpre- 
tation in terms of tolerances of the results of dosage on a certain occasion 
does not depend upon the tolerance of each organism remaining constant 
in time. There is, indeed, experimental evidence that the response-time
of an individual organism to a given dose of drug can vary from one 
occasion to another (Bliss & Beard, 1953), and by analogy tolerances 
might do the same. Berkson’s other example concerned X-rays, which 
are outside our scope. However, he ridiculed the idea of tolerance by 
comparing it to “the resistance of the targets to getting hit in the 
bull’s-eye”’; but such an interpretation might be highly suitable if the 
size of the bull’s-eye varied from target to target. 

If the experimental conditions are rigorously controlled, and if one 
organism shows a quantal response to a drug whereas another treated 
in the same way does not, it seems reasonable to attribute the difference 
to an inherent difference between the organisms during the period of 
action of the drug. The concept of tolerance can be compared to that 
of weight. At a given time the weights of individuals of a species may
differ, and the weight of each may alter with time; but the fact that a 
set of individuals may even be ordered differently by weight on different 
occasions does not invalidate the supposition that an individual has a 
definite weight at a certain moment. We conceive a tolerance, if such 
exist, to be an inherent characteristic of an organism, just as its weight 
is, though of course a tolerance might show temporal variation about 
its general level relatively greater than does weight. On the evidence 
at present available we see no reason to reject the concept of tolerance 
as a working hypothesis. 

4. It seems certain that quantitative changes of some kind always 
accompany the action of a drug, even if most of these changes are not 
so readily observable as the graded responses commonly used in bioassay. 
These changes might be increases in the concentration of certain 
substances within the organism—for example the accumulation of 
acetylcholine following administration of an anticholinesterase. A 
considerable weight of evidence is against the supposition that drug 
action consists of a few indivisible physiological events in an organism 
(see Clark 1933). It seems legitimate to regard any of the quantitative 
changes resulting from drug action as graded responses.

5. We can now put forward a hypothesis that leads to a unification
of the mathematical treatment of the two types of response; namely,
that an individual organism responds quantally if an underlying
quantitative change that results from administration of the drug, and that can
be regarded as a graded response, reaches a certain level of intensity
characteristic of that individual organism. If the dose of drug is insufficient
to bring the quantitative change to the critical level, the quantal
response will not occur. Hence the idea of a tolerance follows
immediately.
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6. It should not, however, be assumed that a quantal response in 
an individual organism could necessarily be related to any of the 
quantitative changes resulting from administration of the drug. Differ- 
ent quantal responses might stem from more or less distinct trains of 
events. If the situation for two trains could be represented thus 


               Quantitative changes A → Quantal responses A’
Dose of drug <
               Quantitative changes B → Quantal responses B’

A’ could most usefully be related to A, and B’ to B. A’ could not usefully 
be related to B nor B’ to A unless the correlation between A and B 
happened to be very high. 

7. In the analysis of quantal response data, we generally suppose
that the tolerance distribution has a frequency function of the form
βf(α + βx), where x measures the amount of the drug on some
convenient scale, and α, β are unknown constants. The proportion of
organisms responding to the drug is then

$$P = \int_{-\infty}^{\alpha + \beta x} f(t)\,dt. \tag{1}$$

For example, probit analysis (Bliss, 1935; Finney, 1952) arises by taking

$$f(t) = \phi(t) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2}t^{2});$$

and logit analysis (Berkson 1944, 1953) when

$$f(t) = \lambda(t) = \tfrac{1}{4}\cosh^{-2}(\tfrac{1}{2}t).$$


The relative advantages of probits and logits have been discussed at 
length elsewhere, and we do not pursue this question here. Whatever 
the choice of f(t), the observations consist of the numbers of organisms 
which have responded, or failed to respond, at various doses. From 
these data, by assuming that organisms behave independently of one
another, and applying some principle such as maximum likelihood or
minimum χ², we arrive at estimates of α and β.
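As an illustration of the estimation step just described, the following is a minimal sketch of a maximum likelihood fit for the probit case, P = Φ(α + βx), by Fisher scoring. The function names, the starting values and the small data set are hypothetical and are not taken from the paper; independence of organisms is assumed, as in the text.

```python
import math

def normal_pdf(t):
    """Standard normal frequency function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def fit_probit(doses, n, r, iters=25):
    """Estimate (alpha, beta) in P = Phi(alpha + beta * x) by Fisher scoring.

    doses: dose metameters x; n: organisms tested; r: organisms responding.
    """
    alpha, beta = 0.0, 1.0  # arbitrary starting values (an assumption)
    for _ in range(iters):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, ni, ri in zip(doses, n, r):
            eta = alpha + beta * x
            p = min(max(normal_cdf(eta), 1e-10), 1.0 - 1e-10)
            d = normal_pdf(eta)
            u = (ri - ni * p) * d / (p * (1.0 - p))  # score contribution
            w = ni * d * d / (p * (1.0 - p))         # expected information weight
            s0 += u
            s1 += u * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # solve the 2 x 2 system (information matrix) * step = score
        alpha += (i11 * s0 - i01 * s1) / det
        beta += (i00 * s1 - i01 * s0) / det
    return alpha, beta

# Hypothetical data: 20 organisms at each of five doses.
alpha_hat, beta_hat = fit_probit([0.0, 0.5, 1.0, 1.5, 2.0], [20] * 5, [2, 6, 11, 16, 19])
print(alpha_hat, beta_hat)
```

Replacing `normal_cdf` and `normal_pdf` by the logistic distribution function and its frequency function would give the corresponding logit fit.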

8. On the other hand, when we are concerned with graded responses, 
each organism exhibits a response y, which is measured on a continuous 
scale, and thus provides more information than the elementary dead-or- 
alive or other binary classification of quantal response. For each value 
of x, the responses are distributed among the population of organisms 
with a frequency function which can be converted to the normal, logistic, 
or any other form, by suitably choosing the scale of y. In particular, if 
the distribution is normal, with variance independent of x, and the 
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factors affecting response enter linearly, the way is then open for the 
application of analysis of variance techniques. 

9. In order to connect these two forms of analysis we postulate for 
every organism the existence of a critical graded response c, which is 
such that the quantal response occurs if the graded response exceeds c, 
but not otherwise. When x is given, the simplest possibility is that in
which c has the same value κ for each organism, but more generally y
and c will have some bivariate distribution among the population of
organisms. To show how the tolerance distribution is derived from the 
graded response distribution, we consider two special cases. 

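(i) The critical graded response is a constant κ; and the distribution
of graded responses has mean θ + γx and frequency function
f{(y − θ − γx)/σ}/σ where f is symmetrical about zero. If f is normal, σ is the
standard deviation; if f is logistic, σ is (S.D.)√3/π. Then

$$P = \int_{\kappa}^{\infty} f\!\left(\frac{y - \theta - \gamma x}{\sigma}\right)\frac{dy}{\sigma} = \int_{-\infty}^{(\theta + \gamma x - \kappa)/\sigma} f(t)\,dt, \tag{2}$$

giving

$$\frac{\gamma}{\sigma}\, f\!\left(\frac{\gamma x + \theta - \kappa}{\sigma}\right)$$

for the frequency function of the tolerance distribution. This formulation
of P agrees with (1) provided that

$$\alpha = (\theta - \kappa)/\sigma \quad\text{and}\quad \beta = \gamma/\sigma. \tag{3}$$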


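(ii) The quantities y and c are jointly distributed with means
θ + γx and κ respectively, in such a way that the distribution of
z = y − c has the frequency function

$$\frac{1}{\sigma}\, f\!\left(\frac{z - \theta - \gamma x + \kappa}{\sigma}\right)$$

where f is symmetrical about zero. Then

$$P = \operatorname{Prob}\{y - c > 0\} = \int_{0}^{\infty} f\!\left(\frac{z - \theta - \gamma x + \kappa}{\sigma}\right)\frac{dz}{\sigma}, \tag{4}$$

which again gives

$$\frac{\gamma}{\sigma}\, f\!\left(\frac{\gamma x + \theta - \kappa}{\sigma}\right)$$

for the frequency function of the tolerance distribution. Evidently the
same tolerance distribution can be derived from widely differing
assumptions.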
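The agreement between the two formulations in case (i) can be checked numerically. The sketch below assumes a normal f and purely hypothetical values of θ, γ, σ and κ; it integrates the graded response distribution above the critical level κ and compares the result with the tolerance form Φ(α + βx), where α = (θ − κ)/σ and β = γ/σ.

```python
import math

def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def p_from_graded(x, theta, gamma, sigma, kappa, steps=20000):
    """Prob(y > kappa) by trapezoidal integration of the graded response density."""
    mean = theta + gamma * x
    lower = kappa
    upper = max(mean, kappa) + 12.0 * sigma  # the tail beyond this point is negligible
    h = (upper - lower) / steps

    def density(y):
        return normal_pdf((y - mean) / sigma) / sigma

    total = 0.5 * (density(lower) + density(upper))
    for i in range(1, steps):
        total += density(lower + i * h)
    return total * h

def p_from_tolerance(x, theta, gamma, sigma, kappa):
    """The same proportion written in the tolerance form Phi(alpha + beta * x)."""
    alpha = (theta - kappa) / sigma
    beta = gamma / sigma
    return normal_cdf(alpha + beta * x)

# Hypothetical parameter values, chosen only for illustration.
theta, gamma, sigma, kappa = 2.0, 1.5, 0.8, 3.0
for x in (-1.0, 0.0, 0.5, 2.0):
    print(x, p_from_graded(x, theta, gamma, sigma, kappa),
          p_from_tolerance(x, theta, gamma, sigma, kappa))
```

Only the ratios (θ − κ)/σ and γ/σ enter the second function, which is another way of seeing that σ cannot be recovered from quantal data alone.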
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10. The interpretation of quantal response data by a tolerance 
distribution can, therefore, if philosophically unsatisfactory, be replaced 
by an interpretation in terms of graded responses. With a constant 
critical graded response, both formulations are equivalent symbolically 
if we replace characteristic tolerance by graded response and dose by 
critical graded response. In most biological problems where graded 
responses are available, it would of course be natural to analyse the 
data without introducing a critical graded response, classifying the 
observations on a quantal basis, and making the calculations much more 
laborious. However, when controlling the quality of mass-produced 
articles, we may either measure a physical dimension or—more quickly 
and cheaply—see whether it exceeds one fixed standard, falls below 
another, or does neither. The amount of information lost by con- 
sidering an experiment only from the quantal viewpoint has been 
examined in this context by Stevens (1948), and had previously been 
investigated in a general context by Pearson (1920). These writers 
show that high efficiencies are possible under certain conditions, but 
—returning to the biological situation—the question of efficiency is 
not the only factor we have to consider, because we lose not only 
information in Fisher’s sense, but any knowledge of the spread of the 
graded response distribution, as is shown above by the way the
parameter σ is absorbed into α and β.

11. In view of the results stated in (i) and (ii) of section 9, it would
obviously be interesting to compare experimental estimates for β and
γ/σ. We sought to compare the value of β for a quantal response with
that of γ/σ for a graded response to the same drug in the same organism.
Moreover, for a valid comparison to be possible, the two responses should
have resulted from the same train of biological events, as explained in
section 6. If in a number of comparisons the value of β commonly
approximated to that of the corresponding γ/σ, the value of γ/σ would
have seemed largely to determine that of β; and this would have indicated
the theory to be correct.

Needless to say, pairs of values satisfying the requirements were 
not available. Instead we could only collect sets of values for the 
two quantities (or rather, their reciprocals) for an assortment of bioassay 
methods. Even so, it was difficult to assemble the values from the
literature on a fair and rational basis. It happens that estimates of 1/β and
σ/γ have been used to compare the sensitivities of different methods of
bioassay. In consequence Gaddum (1933) and Bliss & Cattell (1943) 
collected a number of estimates from the literature. However, each 
value from an assay method of one type was not matched by a value 
from one of another type that corresponded in the sense explained 
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above. Moreover, the samples may be biased: high values for the two 
quantities may not have been recorded in the literature, because high 
values indicate that the experimental methods by which they were 
obtained are inefficient for bioassay. For the want of better samples, 
we have based Table 1 on their data, and it shows the frequencies with
which the different values of 1/β and σ/γ fell within consecutive ranges.
In order to obtain a more homogeneous sample, Table 1 includes values 
only from assay methods that employed vertebrate material. 


TABLE I
The frequencies of different values of estimates of 1/β for quantal responses and σ/γ
for graded responses,* determined from a sample of bioassay methods for a variety of
drugs, employing vertebrate material.


                        Number of values
Range of          1/β (quantal     σ/γ (graded
1/β or σ/γ        responses)       responses)
0    - 0.05            2                2
0.05 - 0.1            11                6
0.1  - 0.2            23               13
0.2  - 0.3             4               13
0.3  - 0.4             4                8
0.4  - 0.5             3                3
0.5  - 0.6             1                0
0.6  - 0.7             1                0
0.7  - 0.8             2                0
0.8  - 0.9             0                0
0.9  - 1.0             1                0
Total                 52               45


*Values taken from Gaddum (1933) and Bliss & Cattell (1943).


For the reasons given, it would be idle to apply a χ² or other test to
compare these two frequency distributions. It is sufficient to notice
that there are obvious similarities with respect to both position and
spread. Thus, although the evidence is very indirect and uncertain,
there is enough agreement between theory and the results of a
heterogeneous collection of experiments to suggest that the value of γ/σ for
an underlying graded response may often be an important factor in
determining the value of β for a quantal response. In order to test the
theory adequately, however, data would have to be obtained
especially for the purpose.
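The similarity in position and spread can be made concrete with a rough grouped-data summary. The sketch below is only an illustration: it assumes each value lies at the midpoint of its range, and it takes the second quantal frequency as 11, the figure required by the printed total of 52.

```python
import math

# (lower limit, upper limit, quantal count, graded count), from Table I
TABLE_I = [
    (0.0, 0.05, 2, 2), (0.05, 0.1, 11, 6), (0.1, 0.2, 23, 13),
    (0.2, 0.3, 4, 13), (0.3, 0.4, 4, 8), (0.4, 0.5, 3, 3),
    (0.5, 0.6, 1, 0), (0.6, 0.7, 1, 0), (0.7, 0.8, 2, 0),
    (0.8, 0.9, 0, 0), (0.9, 1.0, 1, 0),
]

def grouped_mean_sd(column):
    """Mean and standard deviation with every value placed at its class midpoint.

    column = 2 selects the quantal counts, column = 3 the graded counts.
    """
    total = sum(row[column] for row in TABLE_I)
    mids = [(row[0] + row[1]) / 2.0 for row in TABLE_I]
    mean = sum(m * row[column] for m, row in zip(mids, TABLE_I)) / total
    var = sum(row[column] * (m - mean) ** 2 for m, row in zip(mids, TABLE_I)) / total
    return mean, math.sqrt(var)

quantal_mean, quantal_sd = grouped_mean_sd(2)
graded_mean, graded_sd = grouped_mean_sd(3)
print(quantal_mean, quantal_sd)
print(graded_mean, graded_sd)
```

Such a summary is descriptive only; for the reasons given in the text it does not amount to a test of the two distributions.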
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SUMMARY 


The events following the administration of a drug are so viewed 
that the probable interrelationship of the resultant graded and quantal 
responses can be formulated quantitatively. Experimental data 
suitable for testing the predictions from the theory were not found in 
the literature, but such as are relevant lend slight support. To obtain 
suitable data, special experiments would probably need to be carried out. 
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ONE LIKELIHOOD ADJUSTMENT MAY BE INADEQUATE 
H. W. Norton 


University of Illinois 


Fisher (1925) says “. . . since the equations of maximum likelihood
do not always lend themselves to direct solution, it is of importance
that, starting with an inefficient estimate, we can, by a single process
of approximation, obtain an efficient estimate . . . . It is sufficient for
our purpose that the error of estimation is of the order $n^{-1/2}$” and “. . .
starting with an inefficient statistic, a single process of approximation
will in ordinary cases give an efficient statistic differing from the
maximum likelihood solution, by a quantity which with increasing
samples decreases as $n^{-1}$”. The problem is more complicated than Fisher
made it appear. This paper uses a simple example to show that, even
‘in ordinary cases’, one likelihood adjustment is sometimes
inadequate.

Fisher (1950, chapter 9) takes the genetical problem of the frequency 
of crossing over to illustrate the problem of statistical estimation. He 
indicates five different statistics which might be adopted as solutions 
of the problem, each being consistent and each having sampling variance
inversely proportional to the size of the sample. The first and second 
of these five estimates are inefficient, and might be used as starting 
points for the adjustment Fisher proposed. 

The first of these estimates and its successive improved values
appear in Table 1. The first entry in the second column is the value
given by Fisher for the inefficient statistic; the last is the maximum
likelihood estimate, also given by Fisher. The fourth column gives
the negative of the second derivative of the logarithm of the likelihood
function. From the second derivative, the sampling error of the
maximum likelihood estimate is found to be 0.006028, so that the
once-improved estimate, 0.025627, differs from the likelihood estimate by
over 1.67 times the sampling error. Hence the once-improved estimate
must be considered inadequate. Furthermore, the twice-improved
estimate is afflicted with an error of estimation of nearly half a sampling
error and would usually be thought unsatisfactory.


TABLE 1.

Application of the method of maximum likelihood to improve an inefficient statistic.

Order of Improvement    Estimate    Adjustment     −∂²L/∂θ²
        0               0.057046    −0.031419       12341
        1               0.025627     0.007374       51118
        2               0.033001     0.002522       31802
        3               0.035523     0.000189       27787
        4               0.035712     0.0000003      27520
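The successive adjustments can be sketched as Newton iterations on the log likelihood. The four class counts used below are not stated in this paper; they are assumed here to be those of Fisher's crossing-over example, with class probabilities (2 + θ)/4, (1 − θ)/4, (1 − θ)/4 and θ/4, and the function names are illustrative only.

```python
# Class counts assumed from Fisher's crossing-over example (not given in this paper).
a, b, c, d = 1997, 906, 904, 32
n = a + b + c + d

def score(theta):
    """First derivative of the log likelihood L with respect to theta."""
    return a / (2 + theta) - (b + c) / (1 - theta) + d / theta

def neg_second_derivative(theta):
    """Negative of the second derivative of L (fourth column of Table 1)."""
    return a / (2 + theta) ** 2 + (b + c) / (1 - theta) ** 2 + d / theta ** 2

def newton_rows(theta0, steps=5):
    """Return (order, estimate, adjustment, -d2L/dtheta2) for successive adjustments."""
    rows = []
    theta = theta0
    for order in range(steps):
        info = neg_second_derivative(theta)
        adjustment = score(theta) / info
        rows.append((order, theta, adjustment, info))
        theta += adjustment
    return rows

# The inefficient starting statistic is taken to be (a - b - c + d) / n.
rows = newton_rows((a - b - c + d) / n)
for order, theta, adjustment, info in rows:
    print(f"{order}  {theta:.6f}  {adjustment:+.7f}  {info:.0f}")
```

Under these assumptions the rows agree with Table 1 to within rounding, and the sampling error of the final estimate follows as the reciprocal square root of the last value of −∂²L/∂θ².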

A separate consideration of substantial importance is that several 
iterations are necessary before the second derivative reaches good 
agreement with its value at the maximum of the likelihood. In par- 
ticular, Table 1 shows that, if the second derivative were not recalculated 
after the first adjustment, the estimated sampling variance would be 
over twice the sampling variance of the maximum likelihood estimate. 
On the other hand, if the second derivative is recalculated after the 
first adjustment, the estimated sampling variance would be barely 
half as large as that of the likelihood estimate. Misestimation of the 
sampling variance is a serious error, just as it usually is a serious error 
to use an inefficient statistic. Furthermore, it may be worthwhile to 
recalculate the second derivative so as to speed convergence to the 
maximum likelihood estimate. 

The process of approximation proposed by Fisher is simply Newton’s 
method, and so is of wide generality. However, the example shows 
that it is too much to expect that one adjustment of an inefficient 
estimate will result in reasonable agreement with the maximum likeli- 
hood estimate. In fact, one adjustment will be adequate only when 
the value of the second derivative of the logarithm of the likelihood, 
evaluated at the inefficient estimate, differs little from its average 
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value in the interval from the inefficient estimate to the maximum 
likelihood estimate. This will usually be true only when the inefficient 
estimate is already close to the maximum likelihood estimate, that is, 
when its efficiency is fairly high. 

In the example, the efficiency of the inefficient estimate is only 
about 14%, and it is over 3.5 sampling errors away from the maximum 
likelihood estimate. Also, the second derivative changes rapidly, and 
its value of −12341, at the inefficient estimate, is a poor approximation
to −18175, its average value over the interval from the inefficient
estimate to the maximum likelihood estimate. Hence the first adjust- 
ment is nearly half again as large as it should be. 
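The average value quoted above can be recovered from Table 1 alone, since the first derivative of the log likelihood vanishes at the maximum likelihood estimate. A short sketch of that arithmetic, using only the tabled figures (the variable names are illustrative):

```python
# Values read from Table 1 (orders 0 and 4).
theta0, theta_ml = 0.057046, 0.035712
info0, first_adjustment = 12341.0, -0.031419

# Newton's adjustment is score / info, so the score at theta0 is info0 * adjustment.
score0 = info0 * first_adjustment

# Mean value of d2L/dtheta2 between theta0 and theta_ml; dL/dtheta is zero at theta_ml.
avg_second_derivative = (0.0 - score0) / (theta_ml - theta0)

# Ratio of the adjustment actually made to the adjustment that was needed.
overshoot = first_adjustment / (theta_ml - theta0)

print(avg_second_derivative, overshoot)
```

The ratio `overshoot` is the same as the ratio of the average second derivative to its value at the inefficient estimate.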

These considerations apply a fortiori to simultaneous estimation
of two or more quantities. It appears that the only safe rule, in cases
which have not been thoroughly investigated mathematically, is to
repeat the iterative process until the adjustments are small and the
covariance matrix is stable.
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QUERIES 


George W. Snedecor, Editor


QUERY 119: In field experiments with plant spacing, there will be
different numbers of plants per plot for the various spacings, if
plot size be constant. We have an experiment with maize, a 4³
factorial confounded in 4 randomized blocks. The factors are levels of 
nitrogen, phosphorus and spacing. We have run into difficulties in 
attempting to analyze effects of spacing upon proportion-of-fruitful- 
plants, number-of-cobs-per-fruitful-plant, weight-of-grain-per-fruitful- 
plant. Inspection of the data reveals that variation in fruitful plant 
number tends to be related to the mean, directly in the case of spacings, 
and inversely in the case of levels of phosphate. How may the data best 
be handled? Would covariance on number of plants per plot be ap- 
propriate? If so, what about heterogeneous error regressions? 


ANSWER: Since number of plants per plot is purposely varied by the
different spacing treatments, the only reasonable value
to examine in assessing the influence of these treatments
on plant fruiting is the proportion of fruitful plants and not the actual
count of fruitful plants per plot. The straight analysis of variance of
this variable is first examined since the F values are not seriously
affected by heterogeneous errors. (See Biometrics 3:1-52.)


                                 Mean Squares
Source          d.f.    % fruitful plants    Degrees
Blocks            3           118             214
Nitrogen          3            25             [illegible]
  Linear          1            49             280*
  Quad.           1            15              65
Phosphorus        3            85*            122*
  Linear          1           143*            140
  Quad.           1           111*            224*
Spacing           3           119*            240**
  Linear          1           334**           685**
  Quad.           1            16              35
N × P             8           [illegible]      36
N × S             8            20              31
P × S             8            31              40
Error            27            26              42


The analysis of percentages indicates a linear and quadratic effect 
of phosphorus and reference to the means below shows that the effect 
is a positive one which levels off at the higher rates. Spacing has a 
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strong linear effect. No interactions are important. Following are 
the means for the three factors. The range within each treatment is 
given in parentheses as a rough guide to the relative variation. 


             Nitrogen               Phosphorus             Spacing
Level    % (range)   Degrees    % (range)   Degrees    % (range)   Degrees
1       93.2 (17.5)   76.3     90.8 (80)     74.4     90.7 (20)     73.0
2       93.7 (20.0)   76.6     94.6 (18)     79.2     93.0 (30)     77.6
3       93.4 (30.0)   77.8     96.2 (9)      80.9     96.1 (10)     80.3
4       95.9 (22.5)   82.1     95.0 (20)     78.2     97.0 (15)     81.9


It is common experience to find heterogeneous variances among 
means of percentage data if some of the means are near the limit and 
others are near the middle of the range. In this example the highest 
spacing mean is 97.0% and the lowest is 90.7%. A Bartlett test of the 
heterogeneity of variances within the spacing treatments gives χ² =
14.5** (d.f. = 3). The pattern of heterogeneity may be seen from the
ranges in the above table. 

The usual correction for this difficulty is to use the angular trans- 
formation. When this is done, the analysis of variance changes to that 
shown in the last column of the analysis table. These results are a 
little unusual for this type of transformation. The effect of nitrogen 
was only mildly suggestive before, but now a linear effect is significant. 
The phosphorus effect is more strongly quadratic. The spacing effect 
and interactions are relatively unchanged. The heterogeneity of 
variances within spacings now gives χ² = 4.3 which is non-significant.
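For readers unfamiliar with it, the angular transformation replaces a percentage by the angle whose sine is the square root of the proportion. A minimal sketch in degrees follows; the function names are illustrative, and the variance formula is the usual large-sample binomial approximation, not a figure from this answer.

```python
import math

def angular(percent):
    """Angular transformation of a percentage: arcsin(sqrt(p)), in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))

def angular_binomial_variance(n):
    """Approximate binomial variance of the transformed value for n plants (degrees squared)."""
    return (180.0 / math.pi) ** 2 / (4.0 * n)

for pct in (50.0, 90.7, 97.0):
    print(pct, angular(pct))
```

Note that the tabled means in degrees are means of transformed plot values, so they need not equal the transform of the mean percentage.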

These results suggest that the transformation be examined a little 
more closely. The spacing treatments involved a considerable difference 
in numbers of plants per plot, with a consequent possibility of an effect 
on the binomial portion of the variance. However, there seemed to be 
a counterbalancing drift in the means as shown below: 


                                    Variance within spacing treatments
Spacing in row    Avg. No.           %       Degrees    Expected
                  Plants/Plot                           Binomial
10”                 109            31.2       35.2        7.7
15”                  74            64.4       92.2        8.8
20”                  58             9.5       40.4        6.5
25”                  46            17.8       56.4        6.3
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The above table shows that the binomial portion of the variance is 
relatively constant, and furthermore, it constitutes only a quarter of 
the total error (= 26). In view of this, there is reason to question the 
appropriateness of the transformation used. If the sole criterion of the 
success of a transformation is the equalization of the variances, this 
one is satisfactory, even though it was suggested under false pretenses. 
The other criteria that need to be given consideration are discussed by
Bartlett (Biometrics 3:39). They seem to be as well (or better) satisfied
by the transformed data as they were on the original percentage scale.
The association of the means and the variances which was observed by
the investigator is largely due to two or three extreme values which 
fall at different places in the scale of the different factors studied, and 
therefore cannot be corrected by transformation for all factors. It is 
possible that the investigator can find from his field notes explanations 
for these extreme values and can either learn more about the observed 
responses or justify discarding them from the analysis. There does not 
appear to be any further help to suggest from the data themselves. 
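The heterogeneity tests and the binomial column discussed above can be sketched as follows. Several things here are assumptions rather than statements from the answer: that each spacing variance rests on 15 degrees of freedom (16 plots per spacing), that the uncorrected form of Bartlett's statistic was used, that the expected binomial variance of a percentage p from n plants is p(100 − p)/n, and that the first percentage variance, which is hard to read in this copy, is 31.2.

```python
import math

def bartlett_m(variances, df_each):
    """Uncorrected Bartlett statistic for variances with equal degrees of freedom."""
    k = len(variances)
    pooled = sum(variances) / k
    return df_each * (k * math.log(pooled) - sum(math.log(v) for v in variances))

def binomial_variance_percent(p, n):
    """Binomial variance of a percentage p based on n plants."""
    return p * (100.0 - p) / n

var_percent = [31.2, 64.4, 9.5, 17.8]   # first entry is a reading of a damaged figure
var_degrees = [35.2, 92.2, 40.4, 56.4]
mean_percent = [90.7, 93.0, 96.1, 97.0]  # spacing means, levels 1 to 4
plants = [109, 74, 58, 46]               # average number of plants per plot

chi2_percent = bartlett_m(var_percent, 15)
chi2_degrees = bartlett_m(var_degrees, 15)
expected = [binomial_variance_percent(p, n) for p, n in zip(mean_percent, plants)]

print(chi2_percent, chi2_degrees)
print(expected)
```

Under these assumptions the two statistics come out close to the quoted 14.5 and 4.3, and the binomial variances average a little over 7, roughly a quarter of the error mean square of 26.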

The response to plant spacing can be regarded as a continuum over 
the range of interest. If specific points along this range are fixed
experimentally and their effect on a growth variable measured, experimental
control must be precise if the results are to be interpreted. Covariance analysis makes
the tacit assumption that the slope of the response surface is the same 
for all treatments. This obviously cannot be so in this experiment. 
Furthermore, irregularities in stand give different patterns of effect 
depending on whether they represent single missing plants or multiple 
skips. This is bad enough in experiments studying other treatments, 
but in spacing studies it is most difficult. Therefore, it is suggested 
that no covariance be attempted. 

J. A. RIGNEY 


QUERY 120: I have recently been trying to overcome a problem
occurring in fertility records of a group of Merino sheep.
Fertility (lambs born per year) in Merinos is usually restricted
to 3 classes: no lambs, one lamb and occasionally twin (2) lambs.
My problem is to analyse, for purposes of estimating heritability and
genetic correlations, the records of say 300 ewes with fertility records
in six years to obtain an estimate of the repeatability of fertility between
years. The binomial distribution of these data, however, precludes the
use of analysis of variance and intra class correlation.

My aim in estimating repeatability is to express records obtained
for less than 6 years on a comparable basis with the full 6 year records
(by means of Legates and Lush, J. Dairy Sci., 1954, p. 744:
most-probable-(re)producing-ability).

I would be grateful if you could suggest a solution of this problem 
or an alternative method of utilising all records regardless of their 
being for a period of less than full term. 


ANSWER: You might try using the analysis of variance. It is not
precluded as completely as you imply. Probably your
distribution isn’t purely binomial, else you wouldn’t
have any repeatability. It is more likely to be approximately binomial
for each ewe by herself but around a probability of twinning which 
varies from ewe to ewe. That is, the variance within ewes may be 
purely binomial but that between ewes contains an additional element. 
What you really want is the size of that additional element as compared 
with the variance within ewes. Number of lambs at a birth can be 
considered a continuously distributed characteristic which, for ana- 
tomical or mechanical reasons, is limited to three classes (four if triplets 
occur also) in its expression. This is to say that the difficulty is not 
in any fundamental difference between binomial and continuous 
distributions but in the coarseness of grouping. Where the variance 
within ewes is such a large fraction of the total as it will be here, the 
correlation between mean and variance won’t interfere much with the 
analysis of variance, although it is part of the problem. With the 
grouping this coarse and the classes so few, no transformation of scale 
is likely to help. If you wish to pursue this line of thought further, you 
might consult an article by W. G. Cochran in 1943 in the Journal of 
the American Statistical Association, 38:287-301. The title of the 
article is ‘‘Analysis of Variance for Percentages Based on Unequal 
Numbers”. 

Another method of attack, which is simple and easy to explain, is 
merely to compute the regression of the average number of lambs at 
future lambings on the number of lambs at the first lambing, or at the 
second, or at any other one lambing. This regression is the repeatability
coefficient you wish. That this is so is shown in the following equation, 
which would be perfectly valid for continuously distributed character- 
istics and is approximately so, even for such coarse grouping as this. 

Let Y = average number of lambs at future lambings

    X = number of lambs at first lambing (or at some other one lambing)

    t = repeatability of number of lambs from one lambing to another

    n = number of future lambings averaged in Y
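
The equation is not legible in this copy. As a hedged reconstruction from the definitions above, and not the printed formula, the argument can be sketched as follows, assuming that every lambing has the same standard deviation σ_X and that any two lambings of the same ewe have correlation t:

$$
\begin{aligned}
\sigma_Y^2 &= \sigma_X^2\,\frac{1 + (n-1)t}{n}, \qquad \operatorname{Cov}(X, Y) = t\,\sigma_X^2, \\
r_{XY} &= \frac{t\,\sigma_X^2}{\sigma_X\,\sigma_Y} = t\sqrt{\frac{n}{1 + (n-1)t}}, \\
b_{YX} &= r_{XY}\,\frac{\sigma_Y}{\sigma_X}
        = t\sqrt{\frac{n}{1 + (n-1)t}}\;\sqrt{\frac{1 + (n-1)t}{n}} = t .
\end{aligned}
$$

The two square roots are reciprocals of one another, so n drops out of the regression of Y on X but not out of the correlation.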

The averaging in Y affects r_XY in such a way as to cancel exactly its
effect on σ_Y. Therefore, n disappears from the regression of Y on X
(although it would not disappear from the regression of X on Y, and it
does affect the sampling error of b_YX). For complete accuracy the
preceding formula requires (1) that σ_X be the same for all lambings,
whether 1st, 2nd, or nth, and (2) that t be the same between X_1 and X_2
as it is between X_1 and X_3, X_2 and X_n, etc. Minor variations in σ_X
or in r_{X_i X_j} won’t matter much, although perhaps σ_{X_1} is enough smaller
than the other σ_X’s that this should be examined. In working your data
in this way, you would, for example, merely sort all of the ewes you 
have on the number of lambs at their first lambing. Then you would 
compute the actual average number of lambs at all future lambings for 
those who had no lamb the first time, for those who had one lamb the 
first time, and for those who had two lambs the first time. (If age of 
ewe affects the average number of lambs born, you would need to make 
allowance for that in the averages. I would suppose this unimportantly 
small, except in the difference between first lambing and other lambings). 
The repeatability you wish would be the future difference between the 
zeros and the ones, or the future difference between the ones and the 
twos. This raises at once (and incidentally provides the means for 
answering) the question of whether the difference between zero lambs 
and one lamb is the same sort of thing as the difference between one 
lamb and two. It is readily imaginable that the cases of zero lambs 
might represent a different kind of a phenomenon or, at least, be much 
farther away (or closer) on the scale of fertility from the single lambs 
than the single lambs are from the twins. This you examine by seeing 
whether the difference in Y between the zeros and the singles is the 
same as the difference in Y between those who have singles and those 
who have twins. That question, I think, you have to answer (at least 
to your own satisfaction) fairly early in the study. 
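
The sorting procedure just described can be sketched in a few lines. The record layout (one list of yearly lamb counts per ewe, first lambing first), the function name and the six illustrative ewes are all hypothetical; no allowance for age of ewe is made in this sketch.

```python
from collections import defaultdict

def first_lambing_regression(records):
    """Regress the mean of later lambings (Y) on the first lambing (X).

    records: one list of lamb counts per ewe, in order of lambing.
    Returns the regression coefficient of Y on X and the mean Y for each value of X.
    """
    usable = [r for r in records if len(r) >= 2]  # need at least one later lambing
    xs = [r[0] for r in usable]
    ys = [sum(r[1:]) / len(r[1:]) for r in usable]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    group_means = {x: sum(v) / len(v) for x, v in sorted(groups.items())}
    return sxy / sxx, group_means

# Hypothetical records for six ewes, three lambings each.
ewes = [[0, 0, 1], [0, 1, 1], [1, 1, 1], [1, 1, 2], [2, 1, 2], [2, 2, 2]]
b_yx, future_means = first_lambing_regression(ewes)
print(b_yx, future_means)
```

Comparing the step from the zeros to the ones with the step from the ones to the twos in `future_means` is the check on the scale of fertility suggested in the text.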

This would be a study of repeatability of the first lambing, or the 
value of the first lambing as an indicator of all the future lambings. The 
correlations of future lambings with each other are not involved, except 
as they affect the ratio between σ_Y and σ_X. It is readily imaginable
that the first lambing might in some way be different from later lambings 
so that the later ones might be correlated more or less closely with 
each other than they are with the first lambing. For a complete study, 
you would want to investigate that, too, and combine with this earlier 
estimate whatever information the data contain concerning the re- 
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peatability of lambings subsequent to the first. You could do this 
by treating each lambing order in turn just as you treated the first one. 

This raises some questions (minor in this case, I am sure, because 
repeatability will be low) about the best way to pool all this information. 
If you go about it by studying in succession, the regression of the future 
on the first, the future on the second, the future on the third, etc., and 
then pooling all these, you have combined all the information from the 
different ewes by counting each ewe k — 1 times where k is the number of 
lambing seasons she was present. This gives a tiny fraction more 
emphasis to ewes with many lambing seasons than they should have, 
although the undue part of this extra weighting of them is extremely
tiny where the repeatability is as low as this will be. I see no reason
why a little extra weighting of those with many lambings should
consistently tend to tilt b upward (or consistently downward). The smaller
variation at the first lambing might make a difference of one lamb at 
the first lambing mean more than a difference of one lamb at the second 
or third lambing. 

If the size of t is known, or you are willing to postulate it, the
regression of number at future lambings on average number at the first
two lambings, or the first m lambings, can be computed by the following
equation. Let W be the ewe’s average in her first m and Y be the ewe’s
average in her next n lambings.

$$b_{YW} = \frac{mt}{1 + (m - 1)t}$$


For rigorous exactness this also requires that σ_{X_1} = σ_{X_2} = σ_{X_n} and that
r_{X_1X_2} = r_{X_1X_3} = ··· = r_{X_iX_j}, but minor variations from that will have
scarcely any effect. Only the possibilities that σ_{X_1} is distinctly smaller
than the standard deviations at later ages and that zero lamb is physio- 
logically a different kind of phenomenon not on the same scale as singles 
and twins, need concern you here, I think. You can test this formula 
by sorting your ewes on their W values and computing directly the 
regression of Y on W. If the directly observed regression agrees fairly
well with the formula involving t and m, I would think you could
proceed with rather high confidence. The direct determination of the 
regression would convince some of your readers who might be mystified 
by the general formula. 
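
As a check on the general formula, the regression of Y on W can also be computed directly from the covariance structure it presupposes. The sketch below assumes unit variance at every lambing and the same correlation t between any two lambings of a ewe; the function names are illustrative.

```python
def regression_from_covariances(m, n, t):
    """Regression of Y (mean of n later lambings) on W (mean of the first m).

    Built from the covariance matrix of m + n records with unit variances
    and a common correlation t between any two records.
    """
    k = m + n
    cov = [[1.0 if i == j else t for j in range(k)] for i in range(k)]
    cov_wy = sum(cov[i][j] for i in range(m) for j in range(m, k)) / (m * n)
    var_w = sum(cov[i][j] for i in range(m) for j in range(m)) / (m * m)
    return cov_wy / var_w

def regression_formula(m, t):
    """The formula b_YW = m t / (1 + (m - 1) t)."""
    return m * t / (1 + (m - 1) * t)

for m in (1, 2, 3):
    print(m, regression_from_covariances(m, 4, 0.15), regression_formula(m, 0.15))
```

With m = 1 the formula reduces to t, the regression on a single lambing, and n does not enter at all.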
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If the coarseness of grouping introduces only random errors, it makes 
the observed t lower than would be found if the characteristic could be
measured on a continuous scale. This tendency follows Shewhart’s 
(1926) formula: 


$$r_{X_o X_o} = r_{X_t X_t}\,\frac{\sigma^2_{X_t}}{\sigma^2_{X_o}}$$


where the t subscript indicates the true value and the o subscript indicates
the observed value, the latter being the true value, plus or minus a 
random error of observation. Since W won’t be grouped as coarsely as 
X, this probably will make your directly observed regression of Y on W 
slightly higher than the one computed from m and the t value observed
in the regression of Y on X. Other than taking this into account, I see 
nothing you can actually do about the coarseness of grouping, since 
the number at a birth is necessarily discrete. 


J. L. Lush


ABSTRACTS 


Meeting of The Biometric Society, French Region, December 7, 1955 




364 P. CAZAMIAN AND J. L. SOULE. Demonstration by the
Statistical Method of Various Factors Liable to Modify
the Results of Spirography in the Miner.


Statistical analysis of the results of 1355 spirographic
examinations carried out in 1951, 1952 and 1953 at the Centre
Médical d’Etudes des Houillères des Cévennes, on miners or
former miners, most often in connection with expert assessments for silicosis,
has brought out certain interesting facts, among them:

(1) A real but weak association between the radiological picture and
the impairment of respiratory function (the smallness of the correlation
explains quantitatively the contradictions between the conclusions of
different authors according to the number of cases studied).

(2) The study of the regression of the respiratory characteristics on age
revealed statistically significant anomalies, which disappeared when the
population was divided into current miners and former miners. The latter,
when they are not silicotic, have respiratory characteristics very
significantly superior to those of the former. It could be established that
this applied mainly to miners who had recently left underground work
(less than one month).


365 D. BARGETON, P. DEJOURS AND F. GIRARD. Study of the 
Spontaneous Fluctuations of Pulmonary Ventilation at Rest. 


In an intact, awake animal at rest, one observes large spontaneous 
fluctuations of the mean ventilation over one minute, V, and of the 
respiratory frequency f over the same interval. Between them there 
is a linear relation (1) V = a + bf, the correlation coefficient 
reaching or exceeding 0.95. The parameters a and b have stable values, 
of the order of magnitude of the alveolar ventilation and of the dead 
space, respectively. 

Relation (1) is interpreted as expressing the operation of the 
chemical regulation, and the standard deviation about the regression 
characterizes the sensitivity of this regulation. 

A similar relation (2) is found within a single respiratory movement 
if its volume V is compared with its duration P: V = b + aP, the lost 
time being less than the duration of one movement. 

This relationship is maintained by proprioceptive control; it disappears 
under narcosis and upon functional exclusion of the vagus nerves. The 
changes in form of the respiratory movements are subject to a 
stochastic relationship which tends to limit the variations of the ventila- 
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German Region. A meeting in Giessen, Germany, on July 23, 
under the joint sponsorship of the German Region and the Mathematical 
Institute of the Justus-Liebig-Hochschule, offered the following program: 
Wilhelm Ludwig (Heidelberg), Problem of the optimum in biomathe- 
matics, and Harold Hotelling (University of North Carolina), General- 
ized analysis of variance for two or more dimensions of each individual. 

Japan. At a joint meeting at Kyoto University on October 19 
of the Biometric Society and the Research Association of Statistical 
Sciences, the following papers were presented: T. Seguchi, On the 
estimation of birth and death rates; K. Ito, On a test for the multi- 
variate Behrens-Fisher problem; K. Saito, On sampling on successive 
occasions; M. Ogawara, Stochastic prediction of earthquakes; K. 
Sakai, S. Shiraska and T. Okuno, Statistical analysis of an individual 
competition test on sweet potato; H. Inamura and H. Ohata, Statistical 
method in the breeding of barley; and M. Masuyama, Elementary 
method of construction of orthogonal arrays by IBM-602a and by 
hand-sorted punched cards. 

French Region. At the meeting of the Société Française de Biométrie, 
held on Wednesday, December 7, at the Ecole Normale Supérieure in 
Paris, Messrs. Bargeton, Dejours and Girard discussed “Study of the 
spontaneous fluctuations of pulmonary ventilation at rest”, and 
Dr. P. Cazamian and J. L. Soule “Statistical demonstration of various 
factors liable to modify the results of spirography in miners”. 

British Region. The annual meeting of the Region was held at 
the Wellcome Research Institute in London on December 12. The 
following regional officers were elected for 1956: President, D. J. 
Finney; Treasurer, A. R. G. Owen; Secretary, E. C. Fieller; Committee 
members for 1956-58, Sir Ronald Fisher, F. Yates. After the annual 
meeting the following papers were read and discussed: J. G. Skellam, 
A kinetic theory of transects, and Mrs. M. E. Wallace, The use of 
affinity data in chromosome mapping. 

Netherlands. Members of the three biometrical clubs in the 
Netherlands met on December 22 in Utrecht at Hotel Smits, where 
papers were read by C. Postma on principles of multifactorial analysis, 
by G. de Leve on its statistical aspects, and by G. Hamming on re- 
gression and factor analysis. Through lack of funds, it has been 
necessary to discontinue a small periodical in Dutch called “Biometric 
Contacts”, which had been published for two years by the combined 
biometrical clubs. 
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ENAR. The Region met jointly with the American Statistical 
Association and the Institute of Mathematical Statistics in New 
York City on December 27-29, with a program of seven scientific 
sessions and the annual meeting. At the sessions, 177 members regis- 
tered. The opening session on December 27 concerned Probability 
and Statistics in Genetics, with papers by M. Kimura, Some problems 
of stochastic processes in genetics; H. Levene, Estimation of parameters 
in genetic models; and N. E. Morton, Sequential tests for detection 
of linkage in man. On December 28, the first session, on Bioassay, 
opened with two papers on the Precision of microbial assays for vitamin 
B12, the first by W. Weiss, H. Edelson and H. W. Loy and the second 
by C. I. Bliss. S. R. Ames then reported on A slope-ratio liver-storage 
bioassay for vitamin A, and W. R. Bryan on The assay of Rous sarcoma 
virus by tumor response in chickens. The next session concerned 
Statistical Studies of Accident Proneness and the Contagion of Acci- 
dents, the first being a general review by J. Neyman, the second a 
discussion of asymptotic tests and power of tests by C. H. Kraft, and 
the third a limit theorem on related conditional distributions by G. P. 
Steck. An afternoon session on the Interpretation of Genetic Data 
offered papers by A. Kimball, Approximate confidence intervals for 
specific locus mutation rates; T. W. Horner and C. R. Weber, Theo- 
retical and experimental study of selfed populations; O. Kempthorne, 
Epistacy under selfing; and D. S. Robson, Application of the k4 
statistics to genetic variance component analysis. 

The program on December 29 opened with a session on Subjective 
Testing with the following papers: C. I. Bliss and M. Greenwood, 
A rankit analysis for paired comparisons in taste testing; J. W. Hopkins 
and N. T. Gridgeman, Some stimulus response relations in pair-ranking 
taste experiments; E. F. Murphy, Problems needing answers; and G. E. 
Ferris, Three useful designs in taste testing. A noon program on 
Statistics in Medical Experimentation listed papers by D. Blackwell 
and J. L. Hodges, Elimination of selection bias in medical experimen- 
tation; T. S. Ferguson, Estimation of bacterial densities; and A. Berger, 
On comparing survival rates. A session of contributed papers had the 
following program: H. W. Norton, One likelihood adjustment may be 
inadequate; D. R. Cox, Some general remarks on quick tests of signifi- 
cance; A. E. Sarhan, The teaching of statistics in Egypt; and M. C. 
Sheps and P. L. Munson, The use of chick comb biological assay in the 
study of urinary androgens. At the closing annual meeting, attended 
by 24 members, the Region reelected D. B. Duncan as President and 
A. M. Dutton as Secretary-Treasurer, and named E. J. deBeer and 
W. J. Youden members of the Regional Committee for 1956-58. 
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Bowl presented to Miss Cox. A small informal breakfast meeting 
was arranged at the Hotel Biltmore in New York City on December 
29, attended by the following officers of the Society, W. G. Cochran, 
C. I. Bliss, D. B. Duncan, W. J. Youden, J. W. Hopkins, and Gertrude 
M. Cox. On behalf of the Society as a small token of its esteem, President 
Cochran presented Professor Cox with a Revere silver bowl which 
carried the inscription: ‘‘Presented to Gertrude M. Cox by the Bio- 
metric Society in grateful appreciation of her outstanding services as 
the first Editor of BIOMETRICS”. 

General Officers for 1956. The Council of the Society has elected 
the following officers for 1956: President, E. A. Cornish, CSIRO, 
University of Adelaide, Australia; Secretary, M. J. R. Healy, Roth- 
amsted Experimental Station, England; and Treasurer, C. I. Bliss, 
The Connecticut Agricultural Experiment Station and Yale University, 
USA. It is hoped to complete the transfer of the Secretary’s office 
from New Haven to Harpenden during the spring. 

The following were elected to Council for 1956-58: F. J. Anscombe, 
C. Barigozzi, W. G. Cochran, A. Groszmann, L. Martin, C. R. Rao and 
E. J. Williams. 


THE VARENNA SEMINAR IN BIOMETRY 


From a report by L. L. Cavalli-Sforza 


Most universities in continental Europe offer little or no instruction 
in biometric methods for biologists (sensu latissimo). To meet the 
growing demand for such instruction among researchers through a 
brief extra-university course, an International Seminar on Biometric 
Methods was organized by the Italian Region of the Biometric Society 
under the auspices of the IUBS and UNESCO. It was held at Varenna, 
Italy, on Lake Como, on September 7-23, 1955. In view of its novelty, 
at least in Europe, and of the assistance that it may give in planning 
future similar undertakings, the report of the Director of the Seminar, 
Dr. L. L. Cavalli-Sforza, is summarized here for a wider audience. 

Organization. Two principles were followed in organizing the 
Seminar: (1) to conduct the basic courses and as many others as 
possible in the local language, Italian, and (2) to have as international 
a teaching body as possible within this limitation, in order to maintain 
the present international unity in biometrical thought and methodology. 
Contacts with prospective teachers were started a year in advance but 
detailed syllabi were not discussed until February to June, 1955. The 
site chosen, Villa Monastero at Varenna, has in recent years become a 
favorite resort for international summer courses and symposia. It 
offers complete isolation, a beautiful garden, a lecture hall, a smaller 
room for practical exercises, and living accommodations for the teaching 
staff. 

Announcements of the Seminar were sent in March to all university 
departments that might be interested, to governmental and other 
research institutes, to Italian members of the Biometric Society, and 
to European secretaries of the Society. Applications numbered nearly 
100. Although an enrollment of 25 to 30 students was planned originally, 
this number was doubled by having the students work in pairs at the 
calculators. Forty men and 16 women attended, all but one of them 
Italian residents, the other a Swiss. Forty-one held university posts 
and the others research positions. Although applicants with some 
statistical background were favored, the students varied markedly in 
this regard. As indicated by their degrees, 19 had basic training in 
medicine, 18 in the natural or biological sciences, 10 in agriculture, 
and 8 in other fields. 

The Seminar was financed primarily by a grant of $1000 to the 
Biometric Section of IUBS from UNESCO and by student fees (at $15). 
Approximately $300 was appropriated by the Italian Region of the 
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Biometric Society. The Villa was provided rent free by Ente Villa 
Monastero (Como Province) and the calculators obtained on loan. 

Program. There were 14 working days in the 17 days of the course, 
each organized on a common pattern. To accommodate the students’ 
varying statistical and mathematical backgrounds, the following four 
general courses with a lecture each morning were offered, although 
most students preferred to attend all courses. (1) Theoretical foun- 
dations, M. P. Geppert (W. G. Kerckhoff Institute, Bad Nauheim). 
Designed for students with some mathematical background, this course 
covered topics such as probability, one and two variable stochastic 
distributions, random sampling for attributes and for measurements, 
and the major tests of significance. (2) Applied statistical methods, 
C. A. B. Smith (University College, London). An introduction with 
no statistical prerequisites, this dealt with the basic statistical procedures, 
including tests of significance, analysis of variance, simple experimental 
design, regression and matrices. (3) Design of sampling surveys and 
experiments, F. J. Anscombe (Cambridge University). For students 
familiar with basic statistical methods, this considered sampling from 
both homogeneous and non-homogeneous populations and its relation 
to experimental designs. (4) Single degrees of freedom, L. L. Cavalli- 
Sforza (Istituto Sieroterapico, Milan). Primarily for beginners, 
this course dealt with individual comparisons in χ² analysis and in 
the analysis of variance. The last two lectures in the fourth course 
were given by G. Pompilj on a short-cut substitute for the analysis of 
covariance. An early morning discussion by Dr. Geppert complemented 
her lectures in course (1); each of the other courses provided a daily 
practical exercise. 

In the afternoon, the students were divided for practical work at 
the calculators into two groups, one for agricultural and science students 
from 2 to 4 p. m. and the other for medical students from 4 to 6 p. m. 
In the time free of practicals, students attended daily special lectures. 
For the science group, these concerned Statistical Genetics by M. 
Siniscalco, Biometrical Genetics by R. Scossiroli and P. Dassat, Agri- 
cultural Experimentation by P. V. Sukhatme, and Sampling Problems 
by A. Linder and H. Furgag. The medical group attended lectures on 
the application of statistics to Demography and Hygiene by A. Tizzano, 
to Clinical Work by G. Barbensi and A. Linder, and to Bioassay by 
G. A. Maccacaro. In the daily group discussion from 6 to 7:30 p. m., 
R. Scossiroli and G. Maccacaro reviewed the practical exercises for each 
day and their solution. Sir Ronald Fisher gave an additional short 
course of six lectures for a restricted group on “The logic of inductive 
inference’’. 
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Student evaluation. At the end of the Seminar, students filled in 
unsigned questionnaires designed to obtain information for use in 
organizing future similar seminars. The fifty-five questionnaires which 
were returned may be summarized as follows. 

The students were about equally divided among those receiving 
complete, partial and no financial support from their employers for 
attending the seminar. Twenty considered the number attending 
too large, and the remainder (35) that it was approximately right. 
Most students considered the presence of students from different 
biological fields useful (31) or indifferent (19), only four feeling that this 
was objectionable. There was much less approval, however, for the 
mixture of different statistical and mathematical backgrounds, 26 
considering this objectionable, 19 indifferent and only 9 useful. This 
opinion was not related to the background of the respondent. Con- 
sidering their normal working duties and expenses, 80 percent of the 
students thought that the Seminar was of about the right duration. 

As noted above, the course content was about as concentrated as 
possible, and there was a slight preference (30) for a less concentrated 
course. Nearly all students would have welcomed summaries of the 
lectures. Thirty-three considered the time spent at the calculator 
about right, 17 as too short, and only 4 as too long. Twice as many 
found working at the calculators in pairs useful as found it troublesome. 
Guided exercises showing all the steps were preferred by 80 percent 
of the students to those giving very few of the steps, although some 
students preferred a gradual shift from the first to the second type. 
More students preferred to have only the final results of the practicals 
given on the exercise sheet (26), as compared with no answers (15) or 
both intermediate and final results (14). During the practicals, one 
or two students with a better-than-average background knowledge 
worked the problems a day in advance and assisted the other students. 
Eighty percent considered this help sufficient, those who found it 
insufficient having less background knowledge than the others. Although 
a 10-page glossary of definitions and formulas was distributed at the 
beginning, most students never referred to it. There was general 
agreement that organized discussions should have more time. 

Four questions dealt with content. Of the 16 requests for additional 
topics, six were considered as well-grounded; and of the six statements 
listing subjects deemed useless, most referred to details. Of the special 
courses, that on bioassay was the most popular; those given in English 
or French had little attendance. Most suggestions for additional special 
courses were too specialized to be feasible. 

In lieu of examinations, which proved too difficult to attempt, 
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students were asked to score each practical exercise on a scale from 
0 to 2 for their knowledge of the underlying statistical method at the 
start and at the end of the course and its potential usefulness. Taken 
at face value, the mean gain in the score suggested that the average 
student almost tripled his initial knowledge. 

Conclusions. Despite the common complaint that “there was 
no time for sedimentation” and the importance for mathematically 
untrained biologists of a slow process of ‘‘digestion”, the present ex- 
perience has convinced the author that short concentrated courses 
in biometry can be useful. There is currently a real hunger for such 
courses and the demand is likely to increase until more universities 
have established regular courses in biometry with sufficient practical 
work. Apart from instruction in the local language, an international 
teaching body and a pleasant but isolated locale, the enthusiasm of 
the people attending the Seminar from the beginning to the very end 
contributed most to its success. For future similar programs, the 
following suggestions seem pertinent. 

(1) The amount of work should be decreased to eight hours a day 
for three weeks or less, but the general discussions allowed more time. 
This would mean dropping most of the special lectures, but incorporat- 
ing bioassay in one of the general courses. The time spent on theoretical 
foundations might be reduced, although students with an interest in 
learning why and not just how should be encouraged. 

(2) More stratification of the background knowledge of the student 
should be useful, although few biologists in Italy are prepared for 
courses at a higher level. It is not easy to assess the initial knowledge 
of applicants and those who start from scratch may profit the most 
with elementary teaching. 

(3) Discussion groups should be smaller, although this increases 
the number of teachers. If conducted by others than the main lecturers, 
a given problem is more likely to be looked at from different angles. 

Other suggestions will be apparent from the responses to the 
questionnaires. In general, the project was considered very much 
worthwhile and we trust that it can be continued in coming years. 


NEWS AND NOTES 


Professor G. W. Snedecor is currently on assignment in Brazil as 
Consultant in Experimental Statistics, to assist in the design and 
analysis of agricultural experiments and to help complete arrangements 
for a Research and Training Center of Statistics in the State of Sao 
Paulo. The project is under the auspices of the Institute of Statistics, 
University of North Carolina, with financial assistance from the Rocke- 
feller Foundation. His headquarters will be in Campinas for five months 
beginning January 9, 1956. 


Summer Sessions at Berkeley, California. 


The 1956 summer program in the Department of Statistics of the 
University of California, Berkeley, California, will consist of two 
sessions: June 18 to July 28, and July 30 to September 8. The faculty 
of the summer sessions will include Professor D. R. Cox of the University 
of North Carolina, Professor Grace E. Bates of Mount Holyoke College, 
and Professor David Blackwell and Mr. T. S. Ferguson of the Depart- 
ment of Statistics of the University of California. 

The program includes two of the usual undergraduate courses in 
each session, adapted primarily to meet the needs of students trans- 
ferring from other centers who would like to undertake advanced study 
at the University of California during the regular academic year. Also 
a graduate seminar will be conducted by Professor Blackwell. This 
seminar will allow for individual consultation for students working 
toward higher degrees. 


Southern Regional Graduate Summer Session in Statistics. 


A continuing integrated program of graduate summer sessions in 
statistics, to be held in rotation at Virginia Polytechnic Institute, 
Florida University and North Carolina State College was begun in 1954. 
This year’s session will be held at North Carolina State College, Raleigh, 
June 11 through July 20. Eleven courses will be offered in advanced 
calculus, elementary and advanced statistical methods, theory and 
analysis, stochastic processes, sample survey designs, econometric 
methods, linear programming and special problems. Seminars will be 
held twice weekly, and there will also be special Social Science Research 
Council lectures on linear equations and on production functions. A 
maximum of two courses may be taken for residence credit towards a 
graduate degree at any of the cooperating institutions, as well as at 
some other universities. 
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Summer session faculty will comprise D. B. Duncan, Statistical 
Laboratory, University of Florida; W. L. Smith, Dept. of Statistics, 
University of North Carolina; and C. Harrell, Dept. of Economics, J. 
Levine, Dept. of Mathematics, and R. L. Anderson, Gertrude M. Cox, 
A. L. Finkner, A. H. E. Grandage, R. J. Hader and R. J. Monroe, Dept. 
of Experimental Statistics, North Carolina State College. 
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CONFIDENCE INTERVALS FOR VARIANCE 
RATIOS SPECIFYING GENETIC HERITABILITY* 


FRANKLIN A. GRAYBILL, FRANK MARTIN, AND GEORGE GODFREY** 


Oklahoma Agricultural and Mechanical College 


1. Summary. The purpose of this paper is to present a method for 
setting confidence intervals on the ratio of variances in the twofold 
analysis of variance model. Since the ratio of variance components is 
of some importance in the study of genetic heritability, a portion of the 
paper is couched in genetic terminology. 


2. Introduction. Phenotypic differences between individuals in 
most traits are partly due to differences in heredity and partly due to 
the differences in individuals’ environments. Each developed trait is 
the result of the action of genes, the action of the environment, and the 
interaction of the genes and the environment. Heritability is a quantita- 
tive description of the amount of hereditary variation in a trait. 

It is important for the livestock breeder to know which traits have 
some degree of heritability if he wants to make any permanent improve- 
ment in his livestock. The only permanent changes in livestock quality 
are genetic changes brought about by a breeding program that will 
bring together favorable gene combinations. 

It is thus important for the breeder to be able to estimate herita- 
bility of a certain trait. One method of estimating heritability is by 
the technique of the analysis of variance. 

If r sires are each mated at random to s dams and t offspring result 
from each mating, the analysis of variance takes the form as outlined 
in Table 1, where $\sigma_a^2$, $\sigma_b^2$, and $\sigma_c^2$ are respectively the variance 
components among sires, dams, and individuals. 

Using the variance components from Table 1, there are three ways of 
estimating heritability, $h^2$: 

$$h^2 = \frac{4\sigma_a^2}{\sigma_a^2 + \sigma_b^2 + \sigma_c^2}, \qquad h^2 = \frac{4\sigma_b^2}{\sigma_a^2 + \sigma_b^2 + \sigma_c^2}, \qquad h^2 = \frac{2(\sigma_a^2 + \sigma_b^2)}{\sigma_a^2 + \sigma_b^2 + \sigma_c^2}.$$


*This work was sponsored by the Oklahoma Agricultural Experiment Station Project Number 850. 
*kRespectively Associate Professor of Mathematics, Research Assistant in Ldap ae a? and 


Professor of Poultry. 
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TABLE 1 
Analysis of Variance of Inherited Trait 


Source of variation          Degrees of freedom    Mean square    Expected mean square 

Between sires                $n_3 = r - 1$         $A_3$          $\sigma_3^2 = \sigma_c^2 + t\sigma_b^2 + st\sigma_a^2$ 
Between dams within sires    $n_2 = r(s - 1)$      $A_2$          $\sigma_2^2 = \sigma_c^2 + t\sigma_b^2$ 
Between full siblings        $n_1 = rs(t - 1)$     $A_1$          $\sigma_1^2 = \sigma_c^2$ 


If in Table 1 the expected and observed mean squares are equated, the 
resulting equations can be solved for $\sigma_a^2$, $\sigma_b^2$, and $\sigma_c^2$. If these values are 
substituted in the formulas above, the point estimates of $h^2$ are obtained. 
While point estimates give valuable information, it is also desirable to 
have a confidence interval estimate of $h^2$. Reference [1] presents a 
well known method for setting confidence limits on $h^2$ for a simpler case 
than the one presented here. Reference [2] gives a method for finding 
the approximate standard error of $h^2$ for the case presented above. 
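
The point-estimation step just described can be sketched in a few lines. This is only an illustration, assuming the reading of Table 1 in which $A_1$, $A_2$ and $A_3$ are the full-sibling, dam-within-sire and sire mean squares; the function and argument names are hypothetical and do not come from the paper.

```python
def heritability_estimates(A1, A2, A3, s, t):
    """Point estimates of h^2 obtained by equating observed and expected
    mean squares.

    Assumed labelling (an interpretation of Table 1, not quoted from it):
    A1 = full-sibling mean square, A2 = dam-within-sire mean square,
    A3 = sire mean square; s dams per sire, t offspring per mating.
    """
    var_c = A1                    # E[A1] = sigma_c^2
    var_b = (A2 - A1) / t         # E[A2] = sigma_c^2 + t*sigma_b^2
    var_a = (A3 - A2) / (s * t)   # E[A3] = E[A2] + s*t*sigma_a^2
    total = var_a + var_b + var_c
    return {
        "sire": 4 * var_a / total,                  # from the sire component
        "dam": 4 * var_b / total,                   # from the dam component
        "combined": 2 * (var_a + var_b) / total,    # the ratio treated in the paper
    }
```

Note that the moment estimates of the components can be negative when a mean square higher in the table is smaller than the one below it; the sketch does not guard against that.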

The purpose of this paper is to present a method for setting confidence 
limits on the ratio 


$$h^2 = \frac{2(\sigma_a^2 + \sigma_b^2)}{\sigma_a^2 + \sigma_b^2 + \sigma_c^2} \tag{1}$$


for the method of mating explained above. 


3. The approximate distribution of a linear combination of chi-square 
functions. Consider the twofold classification model 


$$Y_{ijk} = \mu + a_i + b_{ij} + c_{ijk} \tag{2}$$


where $i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s$; and $k = 1, 2, \ldots, t$. It is 
assumed that the $a_i$ are distributed normally with mean zero and 
variance $\sigma_a^2$. Similarly the $b_{ij}$ and the $c_{ijk}$ are distributed normally with 
zero means and variances $\sigma_b^2$ and $\sigma_c^2$, respectively. It is further assumed 
that all the terms are uncorrelated. The analysis of variance takes the 
form given in Table 1. 
It is well known that 

$$\frac{n_1A_1}{\sigma_1^2}, \qquad \frac{n_2A_2}{\sigma_2^2}, \qquad \frac{n_3A_3}{\sigma_3^2}$$

are independently distributed as chi-square with $n_1$, $n_2$, and $n_3$ degrees 
of freedom, respectively. While it is known that the linear combination 


PA, 


ae dn ra a 7 
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of independent chi-square variates is distributed as a chi-square variate 
if and only if the coefficients of the linear combination are equal to 
unity, it seems reasonable to assume that the distribution of a linear 
combination of independent chi-square variates can be reasonably well 
approximated by a chi-square distribution even if the coefficients of 
the linear sum are not all equal to one. 

The method presented in this paper is somewhat patterned after the 
method of attack used by Satterthwaite for testing a hypothesis on 
variance components (reference [3]). That is to say, the method 
proposed in this paper consists of equating the moments of 

$$\frac{Y}{N} = \frac{a_2A_2 + a_3A_3}{\sigma_4^2}, \qquad \text{where } \sigma_4^2 = \gamma(\sigma_a^2 + \sigma_b^2) + \sigma_c^2, \tag{3}$$

to the moments of the function $\chi^2_{(N)}/N$, where $\chi^2_{(N)}$ represents a 
chi-square variate with N degrees of freedom. We will determine the 
constants $N$, $a_2$, $a_3$, and $\gamma$ so that the first two moments of $Y/N$ will be 
equal to the first two moments of $\chi^2_{(N)}/N$. We have chosen the $a_i$ so that $\sigma_a^2$ 
and $\sigma_b^2$ have equal coefficients. The importance of this will be seen later. 
If this linear combination of chi-square variates, $Y/N$, is closely 
approximated by $\chi^2_{(N)}/N$, then the ratio $(Y/N)/(A_1/\sigma_1^2)$ will be approximately 
distributed as Snedecor’s F, and this will be used to obtain approximate 
confidence limits on $h^2$. 

The moment generating function of Y/N is 


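$$M_{Y/N}(\theta) = (1 - 2B_2\theta)^{-n_2/2}(1 - 2B_3\theta)^{-n_3/2}$$

where 

$$B_i = \frac{a_i\sigma_i^2}{n_i\sigma_4^2}, \qquad i = 2, 3.$$

Expanding $M_{Y/N}(\theta)$ we get 

$$\begin{aligned}
M_{Y/N}(\theta) = 1 &+ (n_2B_2 + n_3B_3)\theta \\
&+ [n_2(n_2 + 2)B_2^2 + 2n_2n_3B_2B_3 + n_3(n_3 + 2)B_3^2]\,\theta^2/2! \\
&+ [n_2(n_2 + 2)(n_2 + 4)B_2^3 + 3n_2(n_2 + 2)n_3B_2^2B_3 + 3n_2n_3(n_3 + 2)B_2B_3^2 + n_3(n_3 + 2)(n_3 + 4)B_3^3]\,\theta^3/3! \\
&+ \cdots + \Big[\sum_{i=0}^{p}\binom{p}{i}\prod_{j=0}^{i-1}B_2(n_2 + 2j)\prod_{m=0}^{p-i-1}B_3(n_3 + 2m)\Big]\theta^p/p! + \cdots
\end{aligned} \tag{4}$$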


102 BIOMETRICS, JUNE 1956 


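The moment generating function of $\chi^2_{(N)}/N$ is $(1 - 2\theta/N)^{-N/2}$. 
Expanding into an infinite series we get 

$$M_{\chi^2/N}(\theta) = 1 + \theta + \frac{N(N + 2)}{N^2}\,\frac{\theta^2}{2!} + \frac{N(N + 2)(N + 4)}{N^3}\,\frac{\theta^3}{3!} + \cdots + \frac{N(N + 2)\cdots(N + 2p - 2)}{N^p}\,\frac{\theta^p}{p!} + \cdots \tag{5}$$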

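If we equate the first moments of $M_{Y/N}(\theta)$ and $M_{\chi^2/N}(\theta)$, we find 
that $n_2B_2 + n_3B_3 = 1$. Substituting for the $B_i$’s we find that 

$$a_2\sigma_2^2 + a_3\sigma_3^2 = \sigma_4^2. \tag{6}$$

Since $\sigma_4^2 = \gamma(\sigma_a^2 + \sigma_b^2) + \sigma_c^2$, 
it is now possible to determine the values of $a_2$, $a_3$, and $\gamma$. It follows 
from (6) that $\gamma = t$, $a_2 = (s - 1)/s$, and $a_3 = 1/s$. 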
N will be determined by equating the coefficients of $\theta^2/2!$ in equations 
(4) and (5), i.e., by equating the second moments of $Y/N$ and $\chi^2_{(N)}/N$. 
We obtain 

$$\frac{N + 2}{N} = n_2(n_2 + 2)B_2^2 + 2n_2n_3B_2B_3 + n_3(n_3 + 2)B_3^2.$$

Substituting for the $B_i$’s, we get 

$$N = \frac{n_2n_3\sigma_4^4}{n_3a_2^2\sigma_2^4 + n_2a_3^2\sigma_3^4}. \tag{7}$$

We see that the number of degrees of freedom, N, is a function of the 
unknown parameters $\sigma_a^2$, $\sigma_b^2$, and $\sigma_c^2$. 

If we let $K = \sigma_a^2/(t\sigma_b^2 + \sigma_c^2)$, then 

$$N = \frac{r(r - 1)s^2(1 + tK)^2}{(r - 1)(s - 1) + r(1 + stK)^2}. \tag{8}$$
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It has been suggested (reference [3]) that the parameter, K, be 
estimated by the analysis of variance, and the value of N be obtained 
by substituting the estimated value of K into (8). If this procedure 
is followed, the estimated value of K (see Table 1), which is 
$(A_3 - A_2)/tsA_2$, is easily computed. After a value for N has been 
determined, the method explained in section 5 can be used to set 
confidence limits on $h^2$. 
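
As a rough check on the degrees-of-freedom formula, the sketch below evaluates N both from the closed form in K and from the general Satterthwaite-type expression in the mean-square expectations, and computes the plug-in estimate of K. It assumes the labelling $n_2 = r(s-1)$, $n_3 = r-1$, $a_2 = (s-1)/s$, $a_3 = 1/s$ used in this section; the function names are illustrative only.

```python
def satterthwaite_N(K, r, s, t):
    """Closed form: N as a function of K = sigma_a^2 / (t*sigma_b^2 + sigma_c^2)."""
    num = r * (r - 1) * s ** 2 * (1 + t * K) ** 2
    den = (r - 1) * (s - 1) + r * (1 + s * t * K) ** 2
    return num / den


def satterthwaite_N_general(a2, a3, var2, var3, n2, n3):
    """General form: (a2*var2 + a3*var3)^2 / (a2^2*var2^2/n2 + a3^2*var3^2/n3),
    where var2 and var3 are the expected dam and sire mean squares."""
    var4 = a2 * var2 + a3 * var3
    return var4 ** 2 / (a2 ** 2 * var2 ** 2 / n2 + a3 ** 2 * var3 ** 2 / n3)


def estimate_K(A2, A3, s, t):
    """Plug-in estimate of K from the dam (A2) and sire (A3) mean squares."""
    return (A3 - A2) / (t * s * A2)
```

For example, with r = 10, s = 4, t = 3 and K = 0.2, both functions give N of about 25.85 when the expected dam mean square is taken as the unit (so the expected sire mean square is 1 + stK = 3.4).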

While this method for determining N may be satisfactory for many 
purposes, we will present an alternative method which will be applicable 
in many important situations. For this alternative method, we will be 
able to set upper and lower bounds on N. 

Let 

$$D = \frac{\sigma_b^2}{\sigma_a^2}, \qquad W = \frac{\sigma_c^2}{\sigma_a^2},$$

and let f and g be two positive constants such that $f < h^2 < g$. After 
substituting D and W into $h^2$ and performing some algebraic 
manipulations, we arrive at the following inequality for K, the only unknown 
in determining N from (8): 

$$\frac{f}{(2 - f)(D + 1) + ftD} < K < \frac{g}{(2 - g)(D + 1) + gtD}.$$

If we can show that N in (8) is a monotonic function of K, then we can 
substitute the two bounds of K into (8) and get bounds for N. 

To show that N in (8) is a monotonic function of K we proceed as 
follows: 

Since N is a rational function of K with non-vanishing denominator 
(when r > 1, s > 1, and tK > 0), N is a continuous function of K and 
has continuous derivatives of all orders. Taking $dN/dK$ we get 

$$\frac{dN}{dK} = \frac{2r(r - 1)s^2t(1 + tK)[(r - 1)(s - 1) + r(1 + stK)^2] - 2r^2(r - 1)s^3t(1 + tK)^2(1 + stK)}{[(r - 1)(s - 1) + r(1 + stK)^2]^2}.$$

By straightforward manipulations (the numerator reduces to 
$-2r(r - 1)s^2(s - 1)t(1 + tK)(1 + rstK)$) it can be shown (for r > 1, s > 1, 
t > 1, and tK > 0) that $dN/dK < 0$, and thus N is a monotonic function 
of K in the ranges of r, s, and t specified above. Therefore the inequality 
$N_1 < N < N_2$ holds, where $N_1$ and $N_2$ are the values of (8) when 

$$K = \frac{g}{(2 - g)(D + 1) + gtD}$$

and when 

$$K = \frac{f}{(2 - f)(D + 1) + ftD},$$

respectively. 
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If the model (2) represents a genetic model, and if for example in 
a particular problem it seems reasonable to assume thatmis= haauG 
(say), and if the inheritance is additive and autosomal, then D = 1. 
Therefore the inequality 


is obtained. Using these values for K, bounds can be put on N that 
are applicable to the particular problem under investigation. On the 
other hand if the experimenter does not have enough information at 
his disposal to make any particular statement about the bounds on $h^2$, 
it is known that in genetic models $0 < h^2 < 1$. Also in many important 
genetic cases $D \geq 1$. Using these inequalities we get 

$$0 < K < \frac{1}{t + 2}$$

and the relationship 

$$\frac{4rs^2(r - 1)(t + 1)^2}{(r - 1)(s - 1)(t + 2)^2 + r(st + t + 2)^2} < N < \frac{rs^2(r - 1)}{(r - 1)(s - 1) + r} \tag{9}$$

provides bounds on the quantity N. 
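
The two bounds on N are simply the closed-form expression for N evaluated at the ends of the range 0 < K < 1/(t + 2). The following sketch, with hypothetical function names, computes them and lets that correspondence be checked numerically; it assumes the closed form for N in terms of K, r, s and t given in section 3.

```python
def N_of_K(K, r, s, t):
    """N as a function of K (closed form assumed from section 3)."""
    return (r * (r - 1) * s ** 2 * (1 + t * K) ** 2
            / ((r - 1) * (s - 1) + r * (1 + s * t * K) ** 2))


def N_bounds(r, s, t):
    """Lower and upper bounds on N when D >= 1 and 0 < h^2 < 1."""
    lower = (4 * r * s ** 2 * (r - 1) * (t + 1) ** 2
             / ((r - 1) * (s - 1) * (t + 2) ** 2 + r * (s * t + t + 2) ** 2))
    upper = r * s ** 2 * (r - 1) / ((r - 1) * (s - 1) + r)
    return lower, upper
```

Because N decreases as K increases, the lower bound corresponds to K = 1/(t + 2) and the upper bound to K = 0.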


4. The third moment. In this section we will see how the third 
moment of $Y/N$ compares with the third moment of $\chi^2_{(N)}/N$ when N 
is given by (8) and when the inequalities $D \geq 1$ and $0 < h^2 < 1$ hold. 
If we let $d = C_1 - C_2$ where $C_1$ is the coefficient of $\theta^3/3!$ in equation (4), 
and $C_2$ is the coefficient of $\theta^3/3!$ in equation (5), we get 

$$d = n_2(n_2 + 2)(n_2 + 4)B_2^3 + 3n_2n_3(n_2 + 2)B_2^2B_3 + 3n_2n_3(n_3 + 2)B_2B_3^2 + n_3(n_3 + 2)(n_3 + 4)B_3^3 - \frac{(N + 2)(N + 4)}{N^2}.$$

Using equations (6) and (7), which were obtained by equating the first 
and second moments of $Y/N$ and $\chi^2_{(N)}/N$, we get 

$$d = \frac{8a_2a_3\sigma_2^2\sigma_3^2(n_3a_2\sigma_2^2 - n_2a_3\sigma_3^2)^2}{n_2^2n_3^2\sigma_4^8}.$$

Substituting for the constants and simplifying, we obtain 

$$d = \frac{8(s - 1)(1 + stK)(1 + rstK)^2}{r^2(r - 1)^2s^4(1 + tK)^4}.$$
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If we proceed as we did with the second moment, we can obtain the 
following inequalities which are valid if D > 1 and if 0 < h’ els 


88-1) - gc SH VEF Not + t+ Wrst + t+ 2)" 


sri — Jy — 2r°(r — 1)’s‘(t + 1)* 


(10) 


9 


~ 


<¢-D) 


Thus we see that the first two moments of $Y/N$ and $\chi^2_{(N)}/N$ are equal
when $N$ is given by (8) and the third moments differ by $d$ where the
bounds of $d$ are given by (10).
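The size of $d$ can be checked numerically. The sketch below assumes the closed form $d = 8(s-1)(1+stK)(1+rstK)^2 / [r^2(r-1)^2 s^4 (1+tK)^4]$ and takes the two ends of (10) to be its values at $tK = 0$ and $tK = t/(t+2)$; the function names are illustrative, and $r = 22$, $s = 6$, $t = 8$ are borrowed from the numerical example in section 7.

```python
def third_moment_gap(r, s, t, K):
    """Difference d between the third-moment coefficients (assumed closed form)."""
    num = 8 * (s - 1) * (1 + s * t * K) * (1 + r * s * t * K) ** 2
    den = r ** 2 * (r - 1) ** 2 * s ** 4 * (1 + t * K) ** 4
    return num / den


def gap_bounds(r, s, t):
    """The two ends of inequality (10), written out explicitly."""
    low = 8 * (s - 1) / (s ** 4 * r ** 2 * (r - 1) ** 2)
    high = ((s - 1) * (t + 2) * (s * t + t + 2) * (r * s * t + t + 2) ** 2
            / (2 * r ** 2 * (r - 1) ** 2 * s ** 4 * (t + 1) ** 4))
    return low, high


low, high = gap_bounds(22, 6, 8)
d_at_zero = third_moment_gap(22, 6, 8, 0.0)
d_at_max = third_moment_gap(22, 6, 8, 1.0 / (8 + 2))
print(low, high)
```

For these values both ends of (10) are small, which is consistent with the remark that the third moments nearly agree when $r$ is of appreciable size.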


5. Confidence interval on $h^2$. To obtain an approximate $(1 - 2\alpha)$
confidence interval on $h^2$ as defined by (1), we proceed as follows: $Y$ is
approximately distributed as $\chi^2_{(N)}$, $n_3A_3/\sigma_c^2$ is distributed as $\chi^2_{(n_3)}$, and
the two are independent. Therefore

$$w = \frac{Y/N}{A_3/\sigma_c^2}$$

is approximately distributed as Snedecor’s $F$ distribution with $N$ degrees
of freedom in the numerator and $n_3$ degrees of freedom in the denominator.
Let $F_1$ and $F_2$ be two values such that

$$\int_0^{F_1} f(F)\,dF = \int_{F_2}^{\infty} f(F)\,dF = \alpha.$$

Using these we get

$$P(F_1 < w < F_2) = 1 - 2\alpha,$$

where $P(f_1 < q < f_2) = V$ means the probability that $f_1 < q < f_2$ is
equal to $V$. Substituting for $w$ and making some manipulations, we get

$$P\left[\frac{2(C - F_2)}{C + (t - 1)F_2} < h^2 < \frac{2(C - F_1)}{C + (t - 1)F_1}\right] = 1 - 2\alpha, \tag{11}$$

where

$$C = \frac{(s - 1)A_2 + A_1}{sA_3}.$$

Thus the quantity in the brackets determines an approximate $(1 - 2\alpha)$
confidence interval on $h^2$.
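The manipulations leading to (11) can be sketched as follows. The symbol $\rho = (\sigma_a^2 + \sigma_b^2)/\sigma_c^2$ is introduced here only for convenience, and the sketch assumes that $Y/N$ is the pooled mean square $[(s - 1)A_2 + A_1]/s$ divided by its expectation $\sigma_c^2 + t\sigma_b^2 + t\sigma_a^2 = \sigma_c^2(1 + t\rho)$, with $h^2 = 2(\sigma_a^2 + \sigma_b^2)/(\sigma_a^2 + \sigma_b^2 + \sigma_c^2)$.

```latex
\begin{aligned}
w &= \frac{Y/N}{A_3/\sigma_c^2} = \frac{C}{1 + t\rho},
  \qquad h^2 = \frac{2\rho}{1 + \rho}, \\[4pt]
F_1 < \frac{C}{1 + t\rho} < F_2
  &\iff \frac{C - F_2}{tF_2} < \rho < \frac{C - F_1}{tF_1}, \\[4pt]
\rho = \frac{C - F}{tF}
  &\implies h^2 = \frac{2(C - F)}{C + (t - 1)F}.
\end{aligned}
```

Since $h^2$ is an increasing function of $\rho$, the two ends of the interval for $\rho$ map to the two ends of the interval in (11).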


6. Empirical investigation of accuracy. If the value of $N$ in (8) is
used, we have shown that the first two moments of $Y/N$ and $\chi^2_{(N)}/N$ are
equal and, if $D > 1$ and $0 < h^2 < 1$, the third moments are almost equal
when $r$ is of appreciable size. We could examine higher moments to see
how well they fit. The computations, however, become very cumbersome
when this course is pursued; therefore, it was decided to use
empirical sampling to investigate the accuracy of the confidence limits
given in (11) when $N$ is calculated by using the two bounds in (9).

One hundred thousand random normal deviates (reference [4]) were
combined to form the observations $Y_{ijk}$, where

$$Y_{ijk} = \mu + a_i + b_{ij} + c_{ijk} \qquad (i = 1, \ldots, r;\; j = 1, \ldots, s;\; k = 1, \ldots, t).$$

Thus the $a_i$, $b_{ij}$, and $c_{ijk}$ are independent observations drawn from
normal populations whose means are zero and whose variances $\sigma_a^2$, $\sigma_b^2$
and $\sigma_c^2$ are each equal to unity. The value of $\mu$ was taken as 15 in order
to eliminate negative values of $Y_{ijk}$ (the range of the random normal
deviates was from −4.417 to +4.417). This gave a value for $h^2$ of 4/3.
Eight combinations of $r$, $s$, and $t$ were used as follows:


[Table of the eight cases: case number, $r$, $s$, $t$, and total number of observations. The entries are illegible in the source.]


For each case, 500 sets of observations were drawn from the random
normal deviates, 500 analyses of variance were made and from each of
the 500 analyses of variance confidence limits were computed for $h^2$.
For example in case 4, which consisted of a total of 40 observations,
500 sets of 40 observations were drawn, and for each set of 40 an analysis
of variance was run and confidence limits computed for $h^2$ by using the
formula given in (11). Actually two sets of confidence limits were
calculated for each analysis of variance, one using the upper bound of
$N$ (when $tK = 0$) and the other using the lower bound of $N$ (when
$tK = t/(t + 2)$, given in (9)). Confidence limits were also placed on $\sigma_c^2$
as a check to insure that our sample was not too divergent from what
was expected. 95% confidence intervals were used throughout, and
since an exact method of setting confidence limits on $\sigma_c^2$ is available,
we would expect them to contain $\sigma_c^2$ 95% of the time. A chi-square
test was run for each case on the number of times out of 500 that the
calculated limits did not contain $\sigma_c^2$. In none of the eight cases was the
chi-square value significant at the 5% level. Thus there was no reason
to believe that the samples were ‘bad samples.’ The results of the
sampling are reported below.


            Percentage of confidence limits        Percentage of
Case        containing $h^2 = 4/3$                 confidence limits
number      Maximum N        Minimum N             containing $\sigma_c^2 = 1$

1           94.8             97.6                  95.6
2           91.6             93.6                  94.0
3           96.2             96.8                  94.4
4           95.8             96.6                  95.0
5           91.6             96.8                  94.6
6           91.0             95.6                  95.6
7           89.6             93.0                  93.4
8           92.6             94.6                  96.0


The accuracy of the confidence limits on $h^2$ is satisfactory for the
cases cited. In fact, the results were exceptionally good considering
that such small values of $r$, $s$, and $t$ were used. Since $h^2 = 4/3$ is not in
the range which was used to calculate $N$ in (9), it appears that the
confidence limits would be even more accurate if a population had been
selected in which $h^2$ was between zero and one.
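One replicate of the sampling scheme described above might be sketched as follows. This is only an illustration under stated assumptions: pseudo-random deviates from Python's standard library stand in for the tabled deviates, the values $r = 5$, $s = 2$, $t = 4$ are hypothetical (chosen to give 40 observations), and the step that turns the mean squares into limits through (11) is left out because it needs $F$ tables.

```python
import random


def simulate(r, s, t, mu=15.0, rng=None):
    """Draw Y_ijk = mu + a_i + b_ij + c_ijk with unit-variance normal effects."""
    rng = rng or random.Random(0)
    data = []
    for _ in range(r):
        a = rng.gauss(0.0, 1.0)
        group = []
        for _ in range(s):
            b = rng.gauss(0.0, 1.0)
            group.append([mu + a + b + rng.gauss(0.0, 1.0) for _ in range(t)])
        data.append(group)
    return data


def nested_anova(data):
    """Mean squares A1 (between a), A2 (b within a), A3 (within cells)."""
    r, s, t = len(data), len(data[0]), len(data[0][0])
    grand = sum(y for g in data for cell in g for y in cell) / (r * s * t)
    a_means = [sum(sum(cell) for cell in g) / (s * t) for g in data]
    b_means = [[sum(cell) / t for cell in g] for g in data]
    ss1 = s * t * sum((m - grand) ** 2 for m in a_means)
    ss2 = t * sum((b_means[i][j] - a_means[i]) ** 2
                  for i in range(r) for j in range(s))
    ss3 = sum((y - b_means[i][j]) ** 2
              for i in range(r) for j in range(s) for y in data[i][j])
    return ss1 / (r - 1), ss2 / (r * (s - 1)), ss3 / (r * s * (t - 1))


data = simulate(5, 2, 4)          # hypothetical case with 40 observations
A1, A2, A3 = nested_anova(data)
print(A1, A2, A3)
```

Repeating this 500 times per case and counting how often the resulting limits cover the true value would reproduce the kind of percentages tabulated above.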


7. Numerical example. We will present an example using data 
which were obtained from the Poultry Department of Oklahoma 
A and M College. Varying numbers of dams were mated at random to 
sires and the 12-week body weight of male offspring was recorded. If 
the data on all the offspring had been used, it would have demanded an 
unequal subclass analysis. In order to avoid this, and since the above 
method of setting confidence limits on $h^2$ was discussed only for equal
subclass numbers, the data were sampled to provide subclass numbers
which were equal.* The data as finally analyzed consisted of 22 sires
each mated at random to 6 dams, and the 12-week body weight of 8
offspring from each mating was used. An analysis of variance was
made, and the results are given in Table 2.


*Other minor adjustments were also made on these data such as filling in some missing items, etc.


TABLE 2

Analysis of Variance of Poultry Weights

Source of variation           Degrees of freedom    Mean square       Expected mean square
Total                         1055
Between sires                   21                  0.5629 = $A_1$    $\sigma_c^2 + t\sigma_b^2 + st\sigma_a^2$
Between dams within sires      110                  0.2055 = $A_2$    $\sigma_c^2 + t\sigma_b^2$
Between full siblings          924                  0.0924 = $A_3$    $\sigma_c^2$


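To calculate confidence limits on

$$\frac{2(\sigma_a^2 + \sigma_b^2)}{\sigma_c^2 + \sigma_b^2 + \sigma_a^2}$$

by using (11), we must calculate $C$, $F_1$, and $F_2$. By substituting the
values from the table we get

$$C = \frac{(5)(0.2055) + 0.5629}{(6)(0.0924)} = 2.87.$$
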


To obtain $F_1$ and $F_2$ we must first evaluate $N$ and $n_3$. By definition,
$n_3 = 924$, and $N$ will be calculated by using (8). Since we know that
$0 < h^2 < 1$, and since, for these data, we are willing to assume that
$D > 1$, it follows that $N$ satisfies the inequality $64.4 < N < 131$. Since
$n_3$ and $N$ are quite large it will make very little difference which of the
extreme values of $N$ we use. We will use the upper bound to illustrate
the method and then we will also compute the confidence limits for the
lower bound of $N$ and compare the two results. If we set 90% confidence
limits on $h^2$, then $\alpha = .05$, and $F_2$ is the value in Snedecor’s $F$ table which
corresponds to $N$ degrees of freedom for the numerator (larger mean
square) and $n_3$ degrees of freedom for the denominator (smaller mean
square). This gives $F_2 = 1.23$. To obtain $F_1$, let $F^*$ be the tabulated
$F$ value with $n_3$ degrees of freedom for the numerator and $N$ degrees of
freedom for the denominator. Then $F_1 = 1/F^*$. This gives $F_1 =
1/1.26 = 0.79$.
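The substitution into (11) can be checked with a few lines of code. The sketch assumes that (11) reads $2(C - F_2)/[C + (t - 1)F_2] < h^2 < 2(C - F_1)/[C + (t - 1)F_1]$ with $C = [(s - 1)A_2 + A_1]/(sA_3)$; the function name is illustrative, and the $F$ values are simply those quoted in the text for the upper bound of $N$.

```python
def h2_limits(A1, A2, A3, s, t, F1, F2):
    """Approximate confidence limits on h^2 from the three mean squares.

    Assumed reading of (11); F1 and F2 are the lower and upper F points.
    """
    C = ((s - 1) * A2 + A1) / (s * A3)
    lower = 2 * (C - F2) / (C + (t - 1) * F2)
    upper = 2 * (C - F1) / (C + (t - 1) * F1)
    return C, lower, upper


# Mean squares from Table 2; s = 6 dams per sire, t = 8 offspring per mating.
C, lower, upper = h2_limits(0.5629, 0.2055, 0.0924, s=6, t=8, F1=0.79, F2=1.23)
print(round(C, 2), round(lower, 2), round(upper, 2))
```

With these inputs $C$ rounds to 2.87 and the limits fall near .29 and .50.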
Upon substituting the values into (11) we get

$$.29 < h^2 < .50$$

as the 90% confidence limits on $h^2$. If we use the lower bound of $N$
instead of the upper, the limits are $.26 < h^2 < .54$.
We could also calculate confidence limits on $h^2$ by estimating $K$
from the analysis of variance and using this to obtain $N$. We get 0.0363
for the estimate of $K$ and 102 as the corresponding value of $N$. This
gives $.28 < h^2 < .51$.


A PLAN FOR PROGRAMMING ANALYSIS OF VARIANCE
FOR GENERAL PURPOSE COMPUTERS


H. O. Hartley


Iowa State College 


1. The necessity for standardization 


The analysis of variance of data arising from an experiment is a 
numerical procedure which is very frequently performed in numerous 
departments at universities and colleges as well as in other research 
centers. In spite of the considerable volume of computation expended 
on this activity most of the work is still carried out on desk computers, 
and this is even true of many centers at which the services of a high 
speed general purpose computer are available. The reason for this is 
undoubtedly the great variety of experimental designs, each of which 
gives rise to a different type of analysis of variance each applied to a 
small body of data. There is no difficulty in setting up and testing 
suitable programs every time data from a new design are ready for 
analysis, but in so far as the time and effort of doing this is usually 
much greater than the effort of completing the analysis of variance on a 
desk computer, there is clearly no point in enlisting the high speed 
machines.* The question, however, arises whether it is not possible to 
so adapt the analyses of variance for the diverse designs that they can 
all be covered by a standard computing program. Such a standard 
program would be set up and tested once and for all, and would then be 
available for the analysis of variance of data from any design, it being 
only necessary to convey minor program instructions pertaining to each 
particular design. Whilst such a standard program is clearly not possible 
for the analysis of any set of data an attempt is here made to at least 
cover the majority of designs arising in experimental work. Because 
of the above reasons standardization of the program is here regarded 
as of prime importance. It is fully realized that when dealing with a 
particular design on a particular machine there may well be alternative 
programs resulting in shorter computing times. A number of such 
particular programs have been published in the past. (See List of 
References.) The basic analysis to which that of other designs will 
here be reduced is that of a general factorial experiment which will be 
discussed in the next section. 


*The comparative clerical labor of preparing the data for input into the respective machines, 
although by no means negligible, is not discussed here as this depends on details of the organization 
of the computing center.


2. The program for the k-factor experiment based on a special operator
calculus


For convenience we confine ourselves to a $k = 3$ factor experiment.
Let $x_{tij}$ denote the experimental result from the $t$th level of factor ‘T’,
the $i$th level of factor ‘I’ and the $j$th level of factor ‘J’. The symbols $T$, $I$ and
$J$ will also denote the number of levels for each factor so that
$t = 1, 2, \ldots, T$; $i = 1, 2, \ldots, I$ and $j = 1, 2, \ldots, J$. The complete
analysis of variance of the $T \times I \times J$ results into its $2^3 - 1 = 7$ components
is shown in Table 1.


TABLE 1
Analysis of Variance for 3-Factor Experiment

Component        Degrees of freedom
T                (T − 1)
I                (I − 1)
J                (J − 1)
T × I            (T − 1)(I − 1)
T × J            (T − 1)(J − 1)
I × J            (I − 1)(J − 1)
T × I × J        (T − 1)(I − 1)(J − 1)


For the corresponding sums of squares we shall require the familiar
notation for group totals, viz.

$$X_{\cdot ij} = \sum_{t=1}^{T} x_{tij} \quad \text{and likewise } X_{t\cdot j},\; X_{ti\cdot},\; \ldots$$

[The equations defining the operators $\Sigma_t$, $D_t$, $\ldots$ are illegible in the source.]

For example if we apply the first two operators to the original set of
results $x_{tij}$ we have

(3)   $\Sigma_t(x_{tij}) = \sum_t x_{tij} = X_{\cdot ij}$ = total for the $i$, $j$ combination of factors I and J

      $D_t(x_{tij}) = T x_{tij} - X_{\cdot ij} = T(x_{tij} - \bar{x}_{\cdot ij})$ = deviate of $x_{tij}$ from the $i$, $j$ mean (multiplied by $T$).

The above simple operations represent the first two lines in the schedule
of operations shown in Table 2 which gives complete formulas for the
totals and deviates resulting from the sequence of operations $\Sigma_t D_t \Sigma_i
D_i \Sigma_j D_j$ applied to the data $x_{tij}$. The seven sets of deviates finally
reached in lines 9 to 15 are finally subjected to the Mean Square Operation
$(\ )^2$ and the results are the ‘Sums of Squares of Deviations’ (all
multiplied by $TIJ$) for the seven components of variance shown in
the fifth column of Table 2. Table 3 (below) illustrates these operations


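TABLE 2
Schedule of Operations for Three Factor Analysis of Variance

Line   Operator     Applied to values in lines   Will form totals or deviates   Deviates used for analysis of variance components

 1     Input                      $x_{tij}$
 2     $\Sigma_t$   1             $X_{\cdot ij}$
 3     $D_t$        1 and 2       $T x_{tij} - X_{\cdot ij}$
 4     $\Sigma_i$   2             $X_{\cdot\cdot j}$
 5     $\Sigma_i$   3             $T X_{t\cdot j} - X_{\cdot\cdot j}$
 6     $D_i$        2 and 4       $I X_{\cdot ij} - X_{\cdot\cdot j}$
 7     $D_i$        3 and 5       $TI x_{tij} - I X_{\cdot ij} - T X_{t\cdot j} + X_{\cdot\cdot j}$
 8     $\Sigma_j$   4             $X$
 9     $\Sigma_j$   5             $T X_{t\cdot\cdot} - X$                                                   T
10     $\Sigma_j$   6             $I X_{\cdot i\cdot} - X$                                                  I
11     $\Sigma_j$   7             $TI X_{ti\cdot} - I X_{\cdot i\cdot} - T X_{t\cdot\cdot} + X$             T × I
12     $D_j$        4 and 8       $J X_{\cdot\cdot j} - X$                                                  J
13     $D_j$        5 and 9       $TJ X_{t\cdot j} - J X_{\cdot\cdot j} - T X_{t\cdot\cdot} + X$            T × J
14     $D_j$        6 and 10      $IJ X_{\cdot ij} - J X_{\cdot\cdot j} - I X_{\cdot i\cdot} + X$           I × J
15     $D_j$        7 and 11      $TIJ x_{tij} - IJ X_{\cdot ij} - TJ X_{t\cdot j} + J X_{\cdot\cdot j}$    T × I × J
                                  $- TI X_{ti\cdot} + I X_{\cdot i\cdot} + T X_{t\cdot\cdot} - X$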


TABLE 3b
Analysis of Variance of Data, Table 3a

Components       TIJ (S.o.squares)
T                  6
I                 78
J                  4
T × I            186
T × J            182
I × J            182
T × I × J         46

with the help of a simple example in which $T = 3$, $I = 3$ and $J = 2$.
It is these figures, i.e. the number of levels in each factor, that may
vary from experiment to experiment and need to be conveyed to the
machine. If the number of factors, $k$, exceeds 3 the sequence of operators
$\Sigma\, D \cdots$ is extended as part of the program. A decision may have to
be made on the maximum number of factors ($k$) to be covered by the
standard program. In this connection it will be observed from Table 2
that when the operations of lines 1 to 15 are completed, the deviates
formed in lines 1 to 7 are no longer required; if other operations, corresponding
to further factors, require additional storage space, the stores
containing lines 1 to 7 could be cleared and used for subsequent deviates.
If this saving of storage is carried to its full consequences the minimum
storage space required is that for the final set of deviates to which the
mean square operations $(\ )^2$ are applied. To give an example, take a
rather large experiment of $k = 4$ factors each at $p = 6$ levels, i.e. an
experiment consisting of $p^4 = 6^4 = 1296$ observations. By the time
the pair of $\Sigma$ and $D$ operators have been applied for the last (4th)
factor the number of deviates to be stored has risen to

$$\sum_{i=0}^{k} \binom{k}{i} p^i = (p + 1)^k = 7^4 = 2401. \tag{4}$$


For the particular case of a $2^n$ experiment in which all factors are at
2 levels ($T = I = J = 2$) the operators $\Sigma_t, D_t, \ldots$ correspond to the
well known procedure of taking sums and differences with regard to
each factor in turn and the deviates formed in the 4th column of Table 2
are in fact the well known ‘Effects’ (i.e. main effects, 1st order and 2nd
order interaction effects) of the $2^3$ analysis. The present procedure
may therefore be regarded as a generalization of this well known $2^n$
analysis to any number of factors with any number of levels, and it
makes provision for a complete computation of sets of ‘effect values’ in
terms of sets of deviates. In many cases it will be worthwhile to arrange
for the machine to print out these deviates as they usually provide a
better insight into the nature of the data than do the mere sums of
their squares. In particular this program facilitates studies on the
fundamental assumptions of Analysis of Variance such as additivity and
normality as well as simultaneous studies of various variate transformations.

The detailed method of programming the operators $\Sigma_t, D_t, \ldots$ will
depend on the facilities of the particular machine.
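On a present-day machine the pair of operators could be sketched as below. This is a minimal illustration, not the program described in the paper: the dictionary layout, the function names and the invented 3 × 3 × 2 data are all assumptions, and the ‘mean square operation’ is taken to be the mean of the squared deviates, which makes every result equal to the sum of squares multiplied by the total number of observations.

```python
from itertools import product


def sigma(values, axis):
    """Sigma operator: total over the subscript at `axis` (marked None afterwards)."""
    totals = {}
    for key, x in values.items():
        reduced = key[:axis] + (None,) + key[axis + 1:]
        totals[reduced] = totals.get(reduced, 0.0) + x
    return totals


def deviate(values, totals, axis, levels):
    """D operator: (number of levels) * value minus the total over that subscript."""
    return {key: levels * x - totals[key[:axis] + (None,) + key[axis + 1:]]
            for key, x in values.items()}


def factorial_components(data, shape):
    """Apply the Sigma/D pair for each factor in turn.

    Returns {component: mean of squared deviates}; a component is labelled
    by the tuple of factor positions to which D was applied.
    """
    sets = {(): data}
    for axis, levels in enumerate(shape):
        new_sets = {}
        for label, values in sets.items():
            totals = sigma(values, axis)
            new_sets[label] = totals
            new_sets[label + (axis,)] = deviate(values, totals, axis, levels)
        sets = new_sets
    return {label: sum(v * v for v in values.values()) / len(values)
            for label, values in sets.items() if label}


# Invented 3 x 3 x 2 data, for illustration only.
shape = (3, 3, 2)
data = {(t, i, j): float((t + 1) * (i + 2) + j * j + (t * i) % 2)
        for t, i, j in product(range(3), range(3), range(2))}
components = factorial_components(data, shape)
n_obs = len(data)
sums_of_squares = {label: value / n_obs for label, value in components.items()}
```

Dividing each result by the number of observations recovers the ordinary sums of squares; the seven labels `(0,)`, `(1,)`, `(0, 1)`, and so on correspond to T, I, T × I, and the remaining components of Table 1.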


3. The reduction of the analysis of other designs to the factorial program 


3.1 Analysis by standard factorial program. The k-factor experiment
as described in the preceding section will result in an analysis of variance
as shown (for $k = 3$) in Table 1, and in the general case consisting of
$2^k - 1$ components of variance. Whilst such designs are not uncommon
they are certainly not predominant and in the majority of cases the
design employed will not be of this kind. For example, the simplest of
all analysis of variance situations is an experiment in which $T$ experimental
groups are each replicated $I$ times, and $x_{ti}$ denotes the result
from the $i$th replicate in the $t$th treatment group. Since the $I$ replicates
in each treatment group are completely unrelated they do not constitute
a factor. Nevertheless we may analyze the data $x_{ti}$ formally as if
they came from a 2-factor experiment with ‘factors’ T and I and thus
obtain from the standard computing procedure the sums of squares
(multiplied by $TI$) for the components T, I and T × I. To complete
the proper analysis of variance for this experiment the program will
contain a code for a ‘Summary of Components’ which would be appropriate
for this particular design and would read as follows:


Summary: I + (T × I) = Error (within treatments).


Again if we have a ‘Split Plot’ design with $T$ main treatments and $I$
subtreatments arranged in $J$ blocks we can obtain the appropriate
analysis of variance by performing first the standard factorial analysis
and then adding the summary instruction:


Summary: Error (b) = (I × J) + (T × I × J)


The general principle of this procedure is therefore to perform first a
formal factorial analysis and then pool certain components in accordance
with summary instructions which specifically apply to the particular
design. This would certainly be a wasteful procedure for a desk computer
but is convenient for a high speed computer. In Table 4 (at end)
we list examples of such experiments which can be dealt with by this
method. We give for each design


(a) The factors which would be associated with the standard factors 
of the factorial analysis, 

(b) The summary instructions for the design, 

(c) The final analysis of variance in terms of the main effects and 
interactions of the factorial analysis. 


It will be seen that a considerable diversity of designs can be covered
in this manner.
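A summary instruction of this kind amounts to adding named components. The sketch below shows one way it might be coded; the data layout, the function name and the sums of squares are hypothetical, and only the pooling rule I + (T × I) = Error is taken from the text.

```python
def apply_summary(components, summary):
    """Pool factorial components according to a design-specific summary.

    components: {name: (sum_of_squares, degrees_of_freedom)}
    summary:    {pooled_name: [names of components to add together]}
    """
    pooled = {}
    for name, parts in summary.items():
        ss = sum(components[p][0] for p in parts)
        df = sum(components[p][1] for p in parts)
        pooled[name] = (ss, df)
    return pooled


# T = 4 treatment groups, each replicated I = 5 times (invented sums of squares).
T, I = 4, 5
components = {
    "T": (30.0, T - 1),
    "I": (6.0, I - 1),
    "TxI": (18.0, (T - 1) * (I - 1)),
}
summary = {
    "Treatments": ["T"],
    "Error (within treatments)": ["I", "TxI"],
}
table = apply_summary(components, summary)
```

The pooled error carries $(I - 1) + (T - 1)(I - 1) = T(I - 1)$ degrees of freedom, as the within-treatments error should.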


3.2 Analysis involving ‘rearrangements’. Designs not covered by the 
principle of reduction used in the preceding section are those involving 
a higher degree of balance, i.e. a higher degree of restriction in the 
randomization such as the Latin Square, Youden Square, Lattices, 
Incomplete Randomized Blocks and Confounded Designs. In order to 
deal with such situations only one intermediate operation must be 
introduced. This operation, called ‘rearrangement’ will be described 
in terms of the Latin Square below: Let us denote by $x_{ij(t)}$ the record
for the ‘plot’ in the $i$th ‘Row’ and the $j$th ‘Column’ to which the $t$th
treatment has been applied. There are only $T^2$ records and these are
first conveyed to the machine store arranged by the two factors ‘Rows’
$i = 1, 2, \ldots, I = T$ and ‘Columns’ $j = 1, 2, \ldots, J = T$. Along with
$x_{ij(t)}$ we convey the treatment code $t$ and this is carried through all
the $D$ operations, but ignored in the $\Sigma$ operations so that the results of
the sequence $\Sigma_i D_i \Sigma_j D_j$ are the quantities shown below:


        Operation              Yields values
        $\Sigma_j \Sigma_i$    $X_{\cdot\cdot}$
(5)     $\Sigma_j D_i$         $TX_{i\cdot} - X_{\cdot\cdot}$
        $D_j \Sigma_i$         $TX_{\cdot j} - X_{\cdot\cdot}$
        $D_j D_i$              $T^2 x_{ij(t)} - TX_{i\cdot} - TX_{\cdot j} + X_{\cdot\cdot}$ and code $t$


At this stage the latter quantities $D_j D_i x_{ij(t)}$ will be ‘rearranged’ and
conveyed to a store arranged by (say) rows, $i$, and treatments, $t$. To
these quantities $u_{it} = T^2 x_{ij(t)} - TX_{\cdot j} - TX_{i\cdot} + X_{\cdot\cdot}$ will then be
applied the operators $\Sigma'_i$ and $D'_i$ in which the $'$ signifies that the operation
occurs after rearrangement. The results of these operations are as
follows (here $X_{(t)}$ is the total of the records receiving treatment $t$):

        Operation               Yields values
        $\Sigma'_i D_j D_i$     $T^2 X_{(t)} - TX_{\cdot\cdot} - TX_{\cdot\cdot} + TX_{\cdot\cdot} = T(TX_{(t)} - X_{\cdot\cdot})$
(6)     $D'_i D_j D_i$          $T(T^2 x_{ij(t)} - TX_{\cdot j} - TX_{i\cdot} + X_{\cdot\cdot}) - T(TX_{(t)} - X_{\cdot\cdot})$
                                $= T(T^2 x_{ij(t)} - TX_{\cdot j} - TX_{i\cdot} - TX_{(t)} + 2X_{\cdot\cdot})$

[TABLE 4, printed sideways in the source, is not legible. The surviving fragments include the designs ‘T main treatments, I sub treatments in randomized blocks’ and ‘Split-split plot experiment’.]


In terms of these operations the final analysis of variance can clearly
be written as follows:


        Rows          $T^{-2}(\Sigma_j D_i)^2$
        Columns       $T^{-2}(D_j \Sigma_i)^2$
(7)     Treatments    $T^{-4}(\Sigma'_i D_j D_i)^2$
        Error         $T^{-4}(D'_i D_j D_i)^2$


Here the symbol $T^{-4}(D'_i D_j D_i)^2$ means: To the $x_{ij(t)}$ arranged by the
1st factors (Rows $i$, Columns $j$) the operators $D_i$ and then $D_j$ were applied,
the deviates thereby resulting were rearranged by the 2nd factors
(Rows $i$ and Treatments $t$) and to these were applied operation $D'_i$ and
the mean square operation $(\ )^2$, finally the answer was divided by
$T^4$.* With the help of this additional operation a wider variety of
experimental designs can be analyzed in terms of the standard factorial
program. With certain designs it may be necessary to convey a double
subscript code required for a subsequent rearrangement. Some of
these designs are set out in Table 5 (below) in which are shown for
each design the ‘factors’ before and after rearrangement and the final
analysis of variance formulas in terms of the operators $\Sigma$, $D$ and $\Sigma'$, $D'$.
For certain designs it is convenient to also employ a ‘Total Sum of
Squares of Deviations Operation’ $(D)^2$ which is applied to the total
$N = T \times I \times J \times \cdots$ number of observations as soon as they are fed
into the machine. This operator is defined by


$$(D)^2 = \frac{1}{N}\sum (Nx - X)^2. \tag{8}$$
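The Latin Square reduction described in (5) to (7) can be sketched in code as follows. The layout of the stores, the function name and the 3 × 3 data are assumptions made for illustration; the mean square operation is again taken to be the mean of the squared deviates, which is what makes the divisors $T^2$ and $T^4$ appropriate.

```python
def latin_square_anova(x, code):
    """x[i][j]: record in row i, column j; code[i][j]: treatment index of that plot."""
    T = len(x)
    row = [sum(x[i]) for i in range(T)]                        # X_i.
    col = [sum(x[i][j] for i in range(T)) for j in range(T)]   # X_.j
    grand = sum(row)                                           # X..

    # Sigma_j D_i and D_j Sigma_i: deviates for rows and for columns.
    row_dev = [T * row[i] - grand for i in range(T)]
    col_dev = [T * col[j] - grand for j in range(T)]

    # D_j D_i carries the treatment code; rearrange by (row i, treatment t).
    u = [[0.0] * T for _ in range(T)]
    for i in range(T):
        for j in range(T):
            u[i][code[i][j]] = T * T * x[i][j] - T * row[i] - T * col[j] + grand

    # Sigma'_i and D'_i applied to the rearranged store.
    treat_dev = [sum(u[i][t] for i in range(T)) for t in range(T)]
    err_dev = [T * u[i][t] - treat_dev[t] for i in range(T) for t in range(T)]

    def mean_square(devs):
        return sum(v * v for v in devs) / len(devs)

    return {
        "Rows": mean_square(row_dev) / T ** 2,
        "Columns": mean_square(col_dev) / T ** 2,
        "Treatments": mean_square(treat_dev) / T ** 4,
        "Error": mean_square(err_dev) / T ** 4,
    }


# Invented 3 x 3 square with a cyclic treatment arrangement.
x = [[5.0, 7.0, 6.0], [8.0, 4.0, 9.0], [3.0, 6.0, 10.0]]
code = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
anova = latin_square_anova(x, code)
```

The four results are ordinary sums of squares, so they add up to the total sum of squares of the nine records.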


*It will be seen that the only purpose for the ‘rearrangement’ of the $x_{ij(t)}$ by the factors I and T is
to facilitate the subsequent summation $\Sigma'_i$ of the $u_{it}$ over $i$ for constant values of $t$ and subsequent
operation $D'_i$. On most machines, the lengthy operation involved here would be the internal scanning
of the stored code ‘t’. With rearrangement one scanning would be required whilst if operations $\Sigma'_i$ and
$D'_i$ were carried out directly without rearrangement two scannings will normally be required.


It should be noted that the ‘rearrangement operation’ may be
applied at different stages. In the five examples shown in Table 5 it
is applied as follows:


Design:                         Rearrangement applied to quantities:
Latin Square                    $D_j D_i x_{ij(t)}$
Incomplete Blocks Dy, Lox 
Youden Square D; Dees 
Triple Lattice Dj 451; 
Factorial 2° (confounded) Lee 


In the last design, therefore, the data are fed into the machine
arranged by $r$ (Replicates) and treatment combinations ($abc$). They
will be immediately rearranged by the 3 new factors, i.e. by $r$ (Replicate),
$g$ (Treatment Group) and $t$ (No. of Treatment Combination within
Group). If (say) ABC is the confounded effect then this arrangement
of treatment combinations is as follows:


                      No. of Treatment in Group
Treatment Group       t = 1     2      3      4
g = 1                 (1)       ab     ac     bc
g = 2                 abc       c      b      a

The subsequent operations are then completely standard. For the
arrangement by the 1st factors we have an ordinary (non-confounded)
analysis like that in the last example of Table 4, regarding Blocks as
Replicates, for arrangement by the second set of factors we have a
standard split plot analysis. The two analyses should check on the
following components:


        First Factors Analysis:                        Second Factors Analysis:
        (A) + (B) + (C) + (AB) + (AC) + (BC)     =     (T) + (T × G)
(9)     (ABC)                                    =     G
        R                                        =     R
        ‘Error’                                  =     Error (a) + Error (b)


Finally we should mention that the analysis of variance for the 
‘Balanced Incomplete Blocks’ as set out in Table 5 gives the one for 
treatment comparisons without recovery of inter block information 
whilst that for triple lattices is the one providing the weights for
treatment-mean comparisons. There is no inherent difficulty in programming
for the alternative analysis in either design.


4. A universal missing plot formula


It is well known that an orthogonal design in which one or several 
results are missing can be essentially reduced to standard orthogonal 
analysis with the help of what are known as ‘missing plot formulas’. 
Most of these formulas are for a single missing plot and have arisen by 
minimizing the appropriate ‘Error Sum of Squares’ as a function of 
the unknown missing value. The resulting ‘missing plot formulas’ 
differ from design to design and the use of these formulas for high-speed 
computing would require a different missing plot program for every 
design. In order to avoid this we offer here a universal program appli- 
cable to all designs:— 

Assume then that the standard program for the orthogonal analysis 
of variance (i.e. without missing values) is already programmed and 
let $Q$ denote the appropriate ‘Error Sum of Squares’. Denote by
$a_0$, $a_1$ and $a_2$ any three trial values for the missing plot, unity apart, so
that $a_i = a_{i-1} + 1$, and by $Q_0$, $Q_1$ and $Q_2$ the corresponding ‘Error Sum
of Squares’ computed from the observed results supplemented by
$a_0$, $a_1$ and $a_2$ respectively. The universal formula for the missing
value is then given by

$$a = a_1 + \frac{Q_0 - Q_2}{2(Q_0 - 2Q_1 + Q_2)} \tag{10}$$

which is in fact that for the a-scale value at which the parabola through
the three points $(a_0, Q_0)$, $(a_1, Q_1)$ and $(a_2, Q_2)$ attains its minimum.
It will be noted that the only addition to the standard program for analysis
of variance is formula (10), if one is satisfied with the ‘approximate
analysis of variance tests’, i.e. with supplementing the data by ‘the
missing value’ $a$, computed from (10), and using the standard analysis
of variance program with the error degrees of freedom reduced by 1.
By a similar method it is possible to program the ‘exact’ missing plot 
analysis resulting from the likelihood ratio principle. Likewise it is 
possible to cover the case of several missing plots using (10) and the 
customary iterative procedure. 
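Formula (10) needs nothing beyond a routine that returns the Error Sum of Squares for a completed data set. The sketch below applies it to a two-way layout (treatments by blocks); the function names and the data are illustrative assumptions, and the error routine is an ordinary direct computation standing in for the standard program.

```python
def missing_value(error_ss, a0=0.0):
    """Formula (10): minimum of the parabola through three unit-spaced trial values."""
    a1, a2 = a0 + 1.0, a0 + 2.0
    q0, q1, q2 = error_ss(a0), error_ss(a1), error_ss(a2)
    return a1 + (q0 - q2) / (2.0 * (q0 - 2.0 * q1 + q2))


def two_way_error_ss(table):
    """Error sum of squares for a complete rows x columns layout, one value per cell."""
    r, c = len(table), len(table[0])
    g = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(table[i][j] for i in range(r)) for j in range(c)]
    total = sum(v * v for row in table for v in row)
    return (total - sum(v * v for v in rows) / c
            - sum(v * v for v in cols) / r + g * g / (r * c))


# Invented 4 x 3 layout with the record in cell (0, 0) missing.
observed = [[None, 12.0, 15.0],
            [10.0, 14.0, 13.0],
            [9.0, 11.0, 16.0],
            [8.0, 13.0, 12.0]]


def q(a):
    """Error SS after supplementing the data by the trial value a."""
    filled = [row[:] for row in observed]
    filled[0][0] = a
    return two_way_error_ss(filled)


estimate = missing_value(q, a0=10.0)
```

Because the Error Sum of Squares is exactly quadratic in the trial value, the result does not depend on the starting value `a0`, and for this layout it agrees with the classical single-plot formula $(rR + cC - G)/[(r - 1)(c - 1)]$.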


5. The limitations of the present program 


It will be noted that certain Analysis of Variance situations are not 
covered by the present program. Notably these are cases of unbalanced 
non-orthogonal designs. In such cases the complete analysis usually
requires a basic Least Squares model in which all levels of all factors
are represented by ‘unknown effect constants’ to be fitted to the observed
data. Such a computation can, of course, be covered by a general
‘Least Squares’ program comprising the formation of the ‘normal
equations’ and their solution. This part of the analysis (i.e. the estimation
of effect constants) is comparatively straightforward and follows
the principle set out recently by Tocher (1952). However, the final
analysis of variance tests depend on a hierarchical sequence in which
the factors are to be arranged.

Finally we should like to stress that the basic principle of the present 
approach is to reduce the analysis of a given design, by minor additional 
program instructions, to the standard program. Indeed, for the more 
complex designs here discussed it may be argued that the ‘standard’ 
part of the program consists merely of standard ‘subroutines’ in the 
form of the operators $D$, $\Sigma$, $(\ )^2$ and the rearrangement. Whilst these
latter could be used for every design a different ‘summary’ or ‘steering 
program’ would be needed to organize the operations in the correct 
sequence. For the bulk of the simpler designs, however, most of the 
program would consist of the standard subroutines. In the case of a 
very complex experiment it would, therefore, defeat the principle of 
this approach if very elaborate program additions are found to be 
required to reduce the analysis to the standard. In such cases it will be 
wiser to revert to a known desk computer layout and analysis, particularly
if it is unlikely that the design will be used frequently.


NOTE ON FITTING THE MULTI-HIT SURVIVAL CURVE*

Joan M. Gurian


Division of Biological and Medical Research


Argonne National Laboratory, Lemont, Ill.


The expected proportion ($S$) of a population of micro-organisms
surviving irradiation dose $x$ has been described by

$$S = 1 - (1 - e^{-kx})^n.$$


Kimball [1] proposes a graphical method for evaluating the constants
$k$ and $n$ in this expression. He defines his dependent variable

$$u_i = \ln(1 - S_i) \qquad (i = 1, 2, \ldots, p)$$

where $S_i$ is the observed proportion surviving, and minimizes the
quantity

$$V = \sum_{i=1}^{p} \left[u_i - n \ln(1 - e^{-kx_i})\right]^2$$

with respect to $k$, by a graphical procedure. This method has the
advantages of being simple, requiring no arbitrary approximations, and
permitting the use of all data. However, application of the transformation
used in this procedure, without making corresponding adjustments
of the weights, will give the larger $S_i$ unduly heavy weight. This
results in a poor fit at the lower values of $S_i$, if $S_i$ varies by several
orders of magnitude.

Survival experiments are usually designed to give equal percentage
plating error for all $S_i$ or constant variance $\sigma^2$ for all $\log S_i$. If $u_i =
\ln(1 - S_i)$ is used as the dependent variable the sampling variance
of the dependent variable is no longer constant, but is approximately
equal to $\sigma^2[S_i/(1 - S_i)]^2$ [2]. The theoretical weight for $u_i$ is then
$[(1 - S_i)/S_i]^2$. Errors in dose, dilution, etc., are often propagated as $S_i$
decreases [3], hence these effects must also be taken into consideration
in the choice of the weighting function. Ideally the appropriate weights
should be derived empirically from a properly designed experiment; but
prior knowledge may be used to give a rough estimate of the weights,
which may prove adequate for the evaluation of the parameters.
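The approximate variance quoted above follows from a first-order (delta method) argument, sketched here on the assumption that $\log S_i$ denotes the natural logarithm and that terms beyond the first order can be neglected:

```latex
\begin{aligned}
\operatorname{Var}(\ln S_i) = \sigma^2
  &\;\Longrightarrow\; \operatorname{Var}(S_i) \approx S_i^2\,\sigma^2, \\[4pt]
u_i = \ln(1 - S_i), \quad \frac{du_i}{dS_i} = -\frac{1}{1 - S_i}
  &\;\Longrightarrow\; \operatorname{Var}(u_i) \approx \frac{\operatorname{Var}(S_i)}{(1 - S_i)^2}
  = \sigma^2\left[\frac{S_i}{1 - S_i}\right]^2 .
\end{aligned}
```

The weight, being inversely proportional to the variance, is then proportional to $[(1 - S_i)/S_i]^2$.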


*This work was performed under the auspices of the U. S. Atomic Energy Commission.


TABLE 1
Formulae for Approximating n and k

Quantity     Kimball’s formulae                      Weighted Kimball’s formulae

$k$          Graphical estimate                      Graphical estimate

$n$          $\sum u_i v_i / \sum v_i^2$             $\sum w_i u_i v_i / \sum w_i v_i^2$

$v_i$        $\ln(1 - e^{-kx_i})$                    $\ln(1 - e^{-kx_i})$

[The remaining rows, giving $V$ and the sampling errors in terms of $A$, $B$ and $C$, are illegible in the source.]


Table 1 contains in Column 1 Kimball’s formulae used in the approximations
of $n$ and $k$, and the sampling errors of their final values, and
in Column 2 the corresponding modified formulae. The parameters $k$
and n are determined by the same method of iterative approximations as 
in the original procedure. The only increase in labor and complexity is 
that due to the presence of differing weights. 
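A numerical stand-in for the weighted procedure might look like the sketch below. It is not the graphical method itself: a grid search over $k$ replaces the graphical step, the function names are illustrative, and the data are synthetic values generated from the model with $n = 3$ and $k = 0.8$ rather than experimental observations.

```python
import math


def fit_n(x, S, k, weights=None):
    """For a trial k, weighted least-squares n in u_i = n * v_i, and the residual V."""
    w = weights or [1.0] * len(x)
    u = [math.log(1.0 - s) for s in S]
    v = [math.log(1.0 - math.exp(-k * xi)) for xi in x]
    n = (sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
         / sum(wi * vi * vi for wi, vi in zip(w, v)))
    V = sum(wi * (ui - n * vi) ** 2 for wi, ui, vi in zip(w, u, v))
    return n, V


def fit_k_n(x, S, k_grid, weights=None):
    """Pick the k on the grid that minimizes V, then return (k, n)."""
    best_V, best_k = min((fit_n(x, S, k, weights)[1], k) for k in k_grid)
    return best_k, fit_n(x, S, best_k, weights)[0]


# Synthetic, error-free survival proportions from S = 1 - (1 - exp(-k x))^n.
true_n, true_k = 3.0, 0.8
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
surv = [1.0 - (1.0 - math.exp(-true_k * d)) ** true_n for d in doses]
w = [(1.0 - s) ** 2 for s in surv]           # weighting function (1 - S_i)^2
k_hat, n_hat = fit_k_n(doses, surv, [i / 10 for i in range(5, 13)], w)
```

With error-free data the search recovers the generating values; with real data the choice of weights changes which observations dominate the fit, which is the point of the modification.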

Table 2 contains the constants and their errors as computed by the
two procedures, using the data of Dr. S. Pomper on the survival of
Saccharomyces cerevisiae, which were employed by Kimball. The
weighting function used is $w_i = (1 - S_i)^2$. It is based on the assumption
that in this experiment errors in dilution, etc., increase with decreasing
$S_i$ in such a manner that they offset the decreasing theoretical
sampling error, so that the variance of $S_i$ is constant.

Figure 1 shows the data as fitted by the two methods.


QUALITY EVALUATION BY NUMERICAL AND SUBJECTIVE
METHODS, WITH APPLICATION TO DRIED VENEER


W. G. Kauman, J. W. Gottstein, and D. Lantican


Officers of the Division of Forest Products, C.S.I.R.O.,
Instructor in Wood Technology, University of the Philippines


I. INTRODUCTION 


Very different schemes have often been proposed for the evaluation 
of perceptual entities. Advocates of physical measurement, either by 
instruments or by unambiguously defined scoring systems, claim that 
any method of subjective evaluation will suffer from variability between 
and within observers, and from difficulties in training observers other 
than by personal tuition. Those advocating subjective evaluation 
schemes retort that for many of the quantities in question, physical 
measurement would be so involved as to be impracticable, and that the 
results, though reproducible, would often be meaningless for the intended 
purpose. 

If a simple, straightforward method of physical measurement is
available, it would clearly be a waste of time and effort to substitute 
subjective evaluation, as, for instance, for the evaluation of weight or of 
electric current intensity. On the other hand, quantities or properties 
whose physical background is very complex or poorly understood, but 
which form well-defined sensory concepts, such as taste flavors or the 
notion of personal comfort, would appear more amenable to subjective 
evaluation than physical measurement. The use of subjective methods 
for taste testing is so widespread as to require no comment. With 
regard to personal comfort Muncey (1954) showed, for instance, that 
the subjective comfort—rated on a linear scale—may be used as satis- 
factorily as foot temperature to assess the “thermal comfort” of different 
floor materials. 

In many instances, quality judgments are based on a number of
more or less understood physical variables, the assessment depending 
to a large degree on the weighting and integration of the individual 
factors. Perceptive concepts of this type are considered by some workers 
to be of the nature of a “Gestalt”*; for example the “firmness” of rubber
samples (Scott Blair and Coppen, 1942) or the “tackiness” of glues
(Strasburger, 1953). The perception of faults in veneer would also
appear to be of “Gestalt” character, the observer forming his subjective
judgment of the quality by a perceptively integrated impression rather 
than by conscious observation and summation of individual defects 
present. 

Objective numerical evaluation of “Gestalt” properties requires
analysis into dimensionally simple components which are amenable to
physical measurement. Scott Blair and Coppen (1942) consider that
such an analysis by the physicist is “apt to destroy much of the original
concept”, and, at best, “can offer but a blurred picture of the subjective
original”.

Any gain in consistency and reproducibility of results provided by 
a numerical method based on measurements may thus be vitiated by 
the experimenter losing sight of the original property, quite apart from 
the considerable complexity of procedure which is often entailed. 

On the other hand, if a subjective evaluation method is chosen, the 
experimenter must appreciate the importance of the mental attitude 
(Scott Blair and Coppen, 1940) and strain and fatigue of the observers 
(Harper, 1952; Hopkinson, 1952; Wyatt and Langdon, 1932), and of 
proper randomization of samples (Harrison and Elder, 1950; Hopkins, 
1950; Sheppard, 1953). 

It is also necessary to select carefully the actual evaluation technique 
to be used. Bradley (1953) advocates wider use of ranking methods, 
but Hopkins (1950) considers this procedure of advantage only when 
samples are few in number or can be grouped in small sets. For veneer 
quality evaluation, ranking methods would be highly impracticable 
because of the large number of samples usually involved, as would also 
be paired comparisons which are commonly used in tasting techniques 
(Bradley, 1953). Scoring techniques which allow independent assess- 
ment of individual specimens and facilitate standardization of quality 
ratings would appear clearly preferable in this case. 

When selecting an evaluation method for a particular application, 
it should be appreciated that the method need not yield 100 per cent 
accuracy, nor is it necessary for every single judgment to be correct. 
It is sufficient “that the mean of a convenient number of judgments
should not drift, and that the variance of the judgments should be
small in relation to the differences in which (the) experiment is
interested” (Hopkinson, 1953).


*The term “Gestalt” was introduced into psychology by C. Ehrenfels in 1890 to indicate the
character of perception as a unity (Encycl. Brit. 1947).
It would therefore appear that, a priori, numerical systems have no
particular advantage over subjective evaluation methods. The selection 
of a method is largely a matter of defining the desired maximum level 
of variance, and having decided the permissible variance, of considering 
which evaluation method requires least time to produce results of the 
desired accuracy. The actual choice of a method will, of course, be 
influenced by additional arguments, such as the cost of apparatus, 
availability of trained personnel, relative difficulty of training observers, 
and so on. In most practical cases, it will be necessary to base the final 
choice on the statistical analysis of an exploration experiment such as 
the one described in the present paper. 


2. SCHEMES FOR EVALUATION OF THE QUALITY OF DRIED VENEER 


The success or failure of experimental veneer drying schedules is 
usually assessed in terms of the “dried quality” of the veneer produced. 
It is therefore of great importance to evaluate quality by a simple, 
unambiguous method which should give accurate and reproducible 
results. 

The “dried quality” of a sheet of veneer may be defined as the relative
freedom from drying degrade*, with careful specification of the types of 
degrade considered. Evaluation of dried quality is thus largely a 
problem of assessing degrade. 

The assessment may be made either by a method of numerical
measurement of the amount and severity of the different degrade
types present, or by a system involving subjective judgment (Ellwood
1952). Quality evaluation schemes described in the present paper have
been developed for application to “ash” type eucalypt 38 × 38 in.
veneer sample sheets of 1/16 in. thickness (green dimensions). Although
the great majority of species used for veneers can be dried with little
degrade using quite a wide range of drying conditions, in the “ash”
eucalypts drying conditions are extremely critical and the remarkable amount
of degrade which may occur makes assessment of effects difficult and
necessitates the use of special evaluation methods. Application of the
schemes to veneer of other species, dimensions or thickness offers,
however, no major obstacles in principle and will be discussed in a later
section.


The schemes were conceived as a compromise between the requirements
of conformity to commercial standards and ease of application in
experimental research work. They are intended for research purposes


*Defects such as checks, splits or buckling arising in timber during drying are commonly termed
“degrade”.
only, and not to supplant grading schemes commonly used in commercial
practice.

In “ash” eucalypt veneer, it is convenient to distinguish four principal 
types of drying degrade, comprising, in order of decreasing importance: 


(i) Through-checking
(ii) End splitting
(iii) Face checking  (equal in importance to (iv))
(iv) Buckling


Buckling in these species is largely due to differences in collapse*
in early and late wood zones and can usually be reduced to a marked
degree by a steaming treatment known as “reconditioning”; it is
therefore considered a minor form of degrade.

Face checks (i.e. checks in the tight face not penetrating through the 
whole thickness of veneer) often close up during the manufacture of 
plywood and become inconspicuous; they too are considered minor 
degrade. 

End splits are usually a moderately severe form of degrade, but may 
be rated more or less severely according to their constitution. End 
splits with flat and adjacent edges may become practically invisible 
when the veneer is made up into plywood, whereas end splits with curly, 
widely separated or overlapping edges cause severe and often irreparable 
defects. 

Through-checks (i.e. checks penetrating through the whole thickness 
of veneer) generally constitute a severe form of degrade, except when 
they are of the nature of extremely fine hairline checks. Through-checks 
which have an appreciable width cause the glue to well up and spread 
on the surface during pressing, thus spoiling the appearance of the 
plywood. If the through-checks have curly edges, they cause severe
overlapping defects.

Although through-checking, as well as face checking, is usually of 
minor importance in veneer from most species, both these degrade 
types can cause very severe degrade in the ‘‘ash” type eucalypts. 

The amount of degrade of each type present is classified into the 
following six “severity classes”:

0  None
1  Very slight
2  Slight
3  Moderate
4  Severe
5  Very severe


*Collapse is an abnormal form of shrinkage caused by caving in or “collapse’’ of the cell lumina. 
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The evaluation schemes are intended to assess only degrade caused 
by the drying treatment. Any defects present prior to drying, and 
drying degrade clearly caused by blemishes such as knots inherent in 
the timber are therefore disregarded. 

The final result of the evaluation is termed the “quality rating”’ of 
a veneer sheet. It is obtained by arithmetical or mental summation of 
the severity classes (according to the scheme used), and provides a 
quantitative measure of dried quality. 


2.1 Numerical Evaluation Scheme 


In the numerical scheme the allocation of severity classes for checks 
and splits is based on the measured total combined length of each type 
present, as shown below: 


Total combined length of
all checks or splits of          Severity
the type being assessed*         class

0                                   0
Under 2 in.                         1
2 in. to 6 in.                      2
6 in. to 12 in.                     3
12 in. to 24 in.                    4
Greater than 24 in.                 5

*For 1/16 in. thick 38 × 38 in. veneer sheets.
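
The length-to-class allocation above can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and the treatment of the shared boundaries (6, 12 and 24 in. counted in the lower class) is an assumption, since the table does not say to which class a boundary length belongs.

```python
def severity_class(total_length_in: float) -> int:
    """Severity class for checks or splits of one degrade type.

    total_length_in is the measured total combined length in inches on a
    1/16 in. thick, 38 x 38 in. sheet. Boundary handling is an assumption:
    "Under 2 in." and "Greater than 24 in." are taken literally, and the
    shared limits 6 and 12 in. are assigned to the lower class.
    """
    if total_length_in < 0:
        raise ValueError("length cannot be negative")
    if total_length_in == 0:
        return 0  # None
    if total_length_in < 2:
        return 1  # Very slight
    if total_length_in <= 6:
        return 2  # Slight
    if total_length_in <= 12:
        return 3  # Moderate
    if total_length_in <= 24:
        return 4  # Severe
    return 5      # Very severe
```

For example, `severity_class(10)` falls in the 6 in. to 12 in. band and gives class 3.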


No satisfactory method of measurement could be found for buckling, 
and classification of this degrade type is therefore based on an estimate 
of the amount and average amplitude of the corrugations in the veneer 
sheet. The amount of subjectivity introduced by this procedure is 
very small since buckling makes only a minor contribution to the quality 
rating.

Each degrade type is associated with a “numerical weight” which
indicates its relative importance for the overall quality assessment.

The “degrade rating” of a degrade type is obtained by multiplying
the severity class by the numerical weight. Table 1 summarizes degrade
types with their respective numerical weights, and also gives the
maximum degrade rating of each type, obtained by multiplying the numerical
weight by the highest severity class, i.e. by 5.

In samples where several sub-types of a particular degrade type are
present, the total severity class must not exceed 5 and is allotted
according to the total combined lengths of the checks or splits of all sub-types,
but giving full effect to the sub-type having the greatest numerical
weight.

If any type of degrade is localized (i.e. contained in not more than
two regions whose total combined area does not exceed approximately
[fraction illegible] of the area of the veneer sheet), the corresponding degrade rating is
multiplied by 2/3, rounding off to the nearest integer.

The “quality rating’ of a veneer sheet in the numerical scheme is 
the sum of the degrade ratings of all degrade types present. A perfect 
sheet completely free from degrade will thus have a rating of 0, whereas 
a very bad sheet containing a maximum of each degrade type will have 
a rating of 50. 


TABLE 1

Numerical Weights and Maximum Degrade Ratings of Different
Degrade Types

Degrade            Degrade sub-types                             Numerical   Maximum
types                                                            weight      degrade
                                                                             rating

Buckling           -                                                1           5

Face checking      -                                                1           5

End splitting      End splits with flat and adjacent edges.         2
                   End splits with slightly curled and/or
                   widely separated edges.                          3
                   End splits with severely curled and/or
                   overlapping edges.                               4          20

Through-checking   Hairline through-checks.                         2
                   Through-checks of width less than 1 mm.,
                   except hairline checks.                          3
                   Through-checks of width greater than 1 mm.
                   and/or with curled or overlapping edges.         4          20

Maximum quality rating*                                                        50

*Quality rating 50: “very bad” veneer sheet with maximum degrade.
 Quality rating 0: “excellent” sheet free from degrade.
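
As a rough illustration of how the numerical quality rating is built up from severity classes and numerical weights, the sketch below sums class times weight over the sub-types recorded for a sheet. The function and variable names are hypothetical; the sample entries are those of the worked example printed with Figure 2, where end splitting appears as two sub-types. Summing the sub-type products is an assumption read from that example, and the reduction for localized degrade is deliberately left out.

```python
# Each entry: (degrade type, severity class, numerical weight of the sub-type).
# Values follow the worked example accompanying Figure 2 of the paper.
FIGURE_2_SHEET = [
    ("buckling", 1, 1),
    ("face checking", 4, 1),
    ("end splitting", 4, 4),     # severely curled / overlapping edges
    ("end splitting", 1, 3),     # slightly curled / widely separated edges
    ("through-checking", 5, 4),
]


def quality_rating(entries):
    """Sum of degrade ratings (severity class x numerical weight).

    The total severity class allotted to one degrade type must not
    exceed 5, so the per-type totals are checked while summing.
    """
    class_total = {}
    rating = 0
    for degrade_type, severity, weight in entries:
        class_total[degrade_type] = class_total.get(degrade_type, 0) + severity
        if class_total[degrade_type] > 5:
            raise ValueError(f"total severity class for {degrade_type} exceeds 5")
        rating += severity * weight
    return rating


print(quality_rating(FIGURE_2_SHEET))  # 1 + 4 + 16 + 3 + 20 = 44
```

The result agrees with the quality rating of 44 quoted for that sheet; an empty list gives 0, the rating of a sheet free from degrade.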
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2.2 Subjective Evaluation Scheme 


In the subjective scheme, degrade is classified into severity classes 
by visual estimation. In the case of buckling this is again based on an 
estimate of the amount and average amplitude of the corrugations 
present, whereas checks and splits are judged on the basis of the number 
present, the distribution over the veneer face and the average length 
and width of individual checks or splits. As there are no “numerical 
weights’ to be taken into account, the ‘‘degrade rating”’ in the subjective 
scheme is synonymous with the “severity class’’. 

The quality of the sheet is assigned to one of the following nine 
“quality ratings’’: 


0 Excellent
1 Very good
2 Good
3 Very fair
4 Fair
5 Poor
6 Very poor
7 Bad
8 Very bad


The quality rating may be obtained either by mental integration of 
the previously established severity classes or, if no record of severity 
classes is required, by direct subjective judgment based on visual inspec- 
tion of the sheet. The latter of these methods should, however, only 
be used after some experience has been gained with the former. 


3. EXPERIMENTAL PROCEDURE 


Twenty sheets of 1/16 in. thick veneer (green size 38 × 38 in.)
from five trees of alpine ash (Eucalyptus gigantea Hook. f.) were selected
from material included in a study of the mechanical drying of “ash”
type eucalypt veneer (Kauman, Gottstein and Lantican, 1956). The
sheets had been selected to include, as far as possible, all types and
severities of degrade and quality ratings.

Three observers (A, B and C) each carried out two quality evaluations 
on these twenty veneer sheets, using first the subjective and then the 
numerical scheme. Several days were allowed to pass between the first 
and second evaluation by each observer, and the order of the sheets was 
changed between tests. No observer was permitted to see the others’
results, or to be present during the others’ tests, before he had completed
all his own tests.
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The procedure for the subjective scheme was first to assess the 
quality rating, and then to estimate the severity classes of the various 
degrade types present. In this manner it was hoped to obtain quality 
ratings which were independent of the individual severity classes. 

Observer A was most experienced, having completed a quality 
evaluation of some 340 sheets of veneer by both schemes immediately 
prior to the present experiment; B had acted as recorder to A during 
this evaluation, but had not himself carried out evaluations, whereas 
C had no previous knowledge of the schemes and carried out his evalua- 
tion after a brief training period. Observations were recorded by an 
assistant, and the observers did not consciously remember previous 
results during their second evaluation tests. 


TABLE 2
Results of Numerical Quality Evaluation

                          Quality Ratings

Sheet     Observer A        Observer B        Observer C       Mean
 no.     Test    Test      Test    Test      Test    Test
         No. 1   No. 2     No. 1   No. 2     No. 1   No. 2

  1       22      15        21      24        19      11       18.7
  2       31      23        24      21        29      33       26.8
  3       34      37        32      35        40      37       35.8
  4       44      44        41      44        42      40       42.5
  5       10       6        13      19         8       7       10.5
  6       31      32        34      39        34      25       32.5
  7       42      39        39      41        25      30       36.0
  8       28      19        23      27        13      24       22.3
  9       29      29        27      27        28      25       27.5
 10       39      37        29      35        34      36       35.0
 11       21      20        17      25        16      23       20.3
 12       42      44        44      40        41      38       41.5
 13       40      39        34      37        34      31       35.8
 14       11       9         9      12         7       8        9.3
 15       38      37        30      36        44      40       37.5
 16        9      13        14      25        11      12       14.0
 17       20      16        19      19         6      15       15.8
 18       28      31        30      31        24      37       30.2
 19       28      24        25      30        25      22       25.7
 20       15       8        11      19         9      10       12.0
Mean     28.1    26.1      25.8    29.3      24.4    25.2




4. RESULTS


Quality ratings evaluated by the numerical and the subjective 
method are given in Tables 2 and 3. Table 4 summarizes the mean 
degrade ratings for all degrade types evaluated by both schemes. 
Detailed evaluation results of degrade types by the different observers 
are not reproduced here. 




TABLE 3 
Results of Subjective Quality Evaluation 


Quality Ratings 

| Sheet Observer A Observer B Observer C _| Mean 

no. SS SSS OS SS eee 

Test Test Test Test Test Test 
No. 1 No. 2 No. 1 No. 2 No.1 No. 2 

| 1 3 2 3 yore ae ee JES 12 
| 2 7 Zz 5 4.5 ci 5 5.9 re 
) 3 6 5 5 5 5 is. 
4 rs 7 8 Se ie Ox awa 782 
5 1 1 2 2 3 ee leg ee 
6 5 5 io 6 5 Sates 

7 5 6 6 ee on) 5 5.5 

8 3 as.) 5 4 Ee ee 3.8 

9 ts hae £ 5 5 5 ee Bee 

ties |e 86: 6 weer’ 6 LeaAT. ieee | 6.5 : 
; it 4-5 4 | 4 eee Soy cee 4 Rn 4 
; 122 | 8 eat 2 wd 8 eSB eye a cant ee) fe ee 

CE ee oy ee med ee fem PR dane aie eee)? ea | ot 

oye seg ame ere ares She antec ag > 

| OTS TOSS aig ae ee ae Oa a Cg Wie tem (tee eae) ‘ 
Bis 1 ae eee ake We ly eae e 
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TABLE 4 


Mean Degrade Ratings 


Sheet Numerical scheme Subjective scheme 

no.
B. F.C. E.S. T.C. B. F.C. E.S. T.C.

il 2.0 Bud 6.5 6.4 2.2 21 2.0 2.0 
2 4.5 2.8 4.7 14.3 4.8 2.3 2.2 3.4 
3 2.2 320 Sad 14.5 2.4 2.8 3.6 3.5 
4 ibs 4.0 18.3 18.7 1.8 3.23 4.0 4.4 
5 ilge7 1.4 5.2 2.3 Dra Oot 1G 0.5 
6 1.8 3.2 15.8 ihe 2.1 2.8 3.3 2.8 
7 DAG 30 15.8 14.0 ee 2.8 Bo! 3.4 
8 2.0 Dae 9.2 8.5 Dee Pat Diy PA 
9 28 Bi 1320 8.5 2.4 225 Sy 2.6 
10 isis 30 16.5 1355 1.6 Bere 4.2 Ae, 
il 34,2) PO te G oP 4.2 2a: pa 2.0 
12 DD 4.0 16.5 18.2 2,2 Sah 320 4.3 
13 De2 4.0 1750 1253 DD, 3.4 58) a2 
14 Lew ese, 4.7 1.8 128 0.5 Lee 0.7 
15 220 a 13 20.0 2.6 3.2 mae ahi 
16 DED Pa 6.3 2.8 2.5 1.8 2.0 16 
17 PASE 3.0 4.3 6.8 2.8 1.3 1.8 2.0 
18 1.8 3e0 ke} ¥¢ hers 1.8 ou2 art Se 
19 2.0 Pied 1053 LOW 2.3 2.8 3.2 2.8 
20 15 2.8 Den, es 1.4 Po. C7. 2.3 


Abbreviations: 
B. = Buckling 
F.C. = Face checking 
E.S. = End splitting
T.C. = Through-checking. 


Figures 1 and 2 illustrate typical veneer sheets of “good” and “very 
bad” quality, whereas Figure 3 shows the appearance of severe through- 
checking viewed by reflected and transmitted light. (Viewing by 
transmitted light was found to be most effective for detection of fine 
through-checks). The correlation between the numerical and subjective 
quality ratings is shown graphically in Figure 4.

A comparison of the standard deviations with the mean quality 
ratings for all specimens is given in Figure 5, each rating being the 
average of all observations by the three observers. It is obvious that 
the scattering of results is essentially random in nature, although it 
would appear that in the numerical scheme, low quality sheets had 
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FIGURE 1. A “‘Good” quality veneer sheet (Sheet No. 5) 


Numerical Scheme Subjective — 
Scheme 
= Sev.Cl. Num. Wt. Deer. Rat. 

Buckling i 1 J Slight 

Face Checking 1 uf a Very slight 
End Splitting 2 Bex 2/3 6 Slight 

ne 2 

Through-Checking 1 3X 2/3 2 Very slight 

_ —_— 10 Good 


Quality Rating 


xz = Splits present prior to drying 


FIGURE 2. A “Very bad” quality veneer sheet. (This sheet was not included in the present
experiment.)


                       Numerical Scheme                     Subjective
                    Sev. Cl.   Num. Wt.   Degr. Rat.        Scheme

Buckling               1          1           1             Very slight
Face Checking          4          1           4             Severe
End Splitting          4          4          19             Severe
                       1          3
Through-Checking       5          4          20             Very severe

Quality Rating                               44             Very bad
FIGURE 3. Severe through-checking and moderate face checking, viewed
(a) by reflected light
(b) by transmitted light.
FIG. 4. CORRELATION BETWEEN SUBJECTIVE AND NUMERICAL
QUALITY RATINGS. (Abscissa: mean numerical quality ratings;
ordinate: mean subjective quality ratings.)


smaller average standard deviations than higher quality ones. This 
trend is contrary to what one might expect, since low quality sheets 
offer more scope for error due to the higher numerical weights they 
usually carry on end splitting and through-checking. It might indicate 
that observers were more certain and more nearly unanimous in their 
judging of a sheet with a large number of serious defects than one with
a smaller number. It is, however, possible that had there been a number
of very good to excellent sheets, they would have been judged with 
greater assurance (note position of the point for sheet 14 in Fig. 5a). 

Table 5 shows the mean absolute differences between the first and 
the second degrade ratings by each observer for all degrade types in 
both schemes, and also the mean absolute differences between the 
quality ratings. Even though the numerical degrade ratings for end 
splitting and through-checking are multiplied by a numerical weight 
factor of 2, 3, or 4, so that their differences are of a greater magnitude 
than those for buckling and face checking, it would appear that all 
observers had more difficulty in being consistent in their numerical 
assessment of end splitting and through-checking than in assessing 
buckling and face checking. The effect of experience seems to be 
evident in a comparison of the mean difference of the numerical quality 
ratings of observer A (3.3) with those of the other observers (4.2 and 
4.6); the mean difference is smallest for the former. This is, however, 
FIG. 5. RELATION BETWEEN QUALITY RATINGS AND STANDARD DEVIATIONS.
(a) NUMERICAL SCHEME
(b) SUBJECTIVE SCHEME
(Abscissa: standard deviation; ordinate: mean quality rating. The point
for sheet 14 is marked in panel (a).)
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TABLE 5 
Mean Absolute Differences between First and Second Degrade and 
Quality Rating Evaluation by Each Observer 


Observer A Observer B Observer C 
Degrade type 
Num. Sub. Num. Sub. Num. Sub. 
Buckling 0.4 0.3 0.4 0.2 O87 ORS 
Face checking 0.4 0.6 ual 0.5 0.6 0.4 
End splitting 1.8 0.4 3.6 0.4 2.8 1.0 
Through-checking 2.8 0.4 2.8 0.4 3.2 0.6
Quality rating 3.3 0.7 4.2 0.4 4.6 0.7 
Num. = Numerical scheme; Sub. = Subjective scheme. 


not the case for the subjective ratings for which observer B obtained 
the greatest consistency. It seems rather surprising that the personal 
factor of experience should play a greater role in the numerical than in 
the subjective scheme, and this observation may suggest that the 
numerical evaluation is influenced by subjective factors. 
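
The consistency measure used in Table 5 can be illustrated with a short calculation. The sketch below takes the two sets of numerical quality ratings of observer B from Table 2 and forms the mean absolute difference between first and second evaluation. The variable names are hypothetical, and the second-test values for sheets 18 and 20 are read here as 31 and 19, which is an assumption about damaged entries in this copy.

```python
# Observer B, numerical quality ratings for sheets 1 to 20 (Table 2).
B_TEST_1 = [21, 24, 32, 41, 13, 34, 39, 23, 27, 29,
            17, 44, 34, 9, 30, 14, 19, 30, 25, 11]
# Sheets 18 and 20 are read as 31 and 19 (assumed readings).
B_TEST_2 = [24, 21, 35, 44, 19, 39, 41, 27, 27, 35,
            25, 40, 37, 12, 36, 25, 19, 31, 30, 19]


def mean_absolute_difference(first, second):
    """Average of |first - second| over paired ratings of the same sheets."""
    if len(first) != len(second):
        raise ValueError("both tests must cover the same sheets")
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)


print(mean_absolute_difference(B_TEST_1, B_TEST_2))  # 84 / 20 = 4.2
```

Under these readings the result matches the value of 4.2 given for observer B in the text.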

Further analysis to elicit the reason for the uncertainty in the 
numerical evaluation of end splitting and through-checking showed 
that all observers had marked difficulty in the determination of severity 
classes, even though these are nominally based on supposedly unam- 
biguous measurement of the total length of checks or splits. This is 
due to the fact that an observer who has to inspect a large number of 
sheets in a limited time cannot possibly accurately measure the total 
length of the checks or splits present, but has to rely on visual
estimation to make his assessment. The same applies to the numerical weights,
which, in practice, were also often allotted by subjective judgment
rather than detailed quantitative evaluation of degrade sub-types.

There were also considerable inconsistencies in observers’ judgments 
of whether degrade was localized or not. 


5. STATISTICAL ANALYSIS OF RESULTS


Analyses of variance were carried out on the experimental results,
giving an assessment of the magnitude of effects of the following factors:

(i) Sheets (S)
(ii) Observers (O)
(iii) Repetitions by an observer (R)
(iv) Sheet-observer interaction (SO)
(v) Sheet-repetition interaction (SR)

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A summary of the analyses for quality ratings in both schemes is 
given in Table 6. 

Of the variance values given in this table, the observer component 
in both schemes, and the repetition component in the subjective scheme, 
are non-significant, but all other components are significant at the 1 per 
cent level. 

It should be noted that the highly significant sheet-observer inter- 
action in both schemes is entirely consistent with the non-significance 
of the observer component. 


5.1 Variance of Sheet Means (Quality Ratings)

The variance of sheet means, $V_s$, in the present experiment was
found to be

$$V_s = \frac{(SR) + 2(SO)}{6} \tag{1}$$

The value of this expression for the numerical scheme is

$$V_s(\text{num.}) = \frac{(SR) + 2(SO)}{6} \approx 3.6$$

so that the standard error of sheet means is 1.9. In the case of the
subjective scheme, equation (1) has the value

$$V_s(\text{sub.}) = \frac{0.7928}{6} = 0.1321$$

and the standard error of sheet means is then 0.36.

To predict the variance and standard error for any method of
examination, we may write, by analogy, the variance of sheet means: if m
observers each take n readings on each sheet,

$$V_s = \frac{(SR) + n(SO)}{mn} \tag{2}$$

observers, the analysis of an actual test by a single observer would have
to be different from that given. It is clear that if there is only one
observer, no components of variance can be estimated for observers and
sheet-observer interaction.

The values of standard errors of sheet means (in per cent of the total
range of ratings) for different numbers of observers and readings are
plotted in Figure 6. It is obviously preferable to increase the number
of observers rather than to increase the number of readings per observer.
Even the extreme case of one observer taking 100 readings will lead to
a greater standard error than two observers each taking two readings, or
three observers each taking one reading.

FIG. 6. STANDARD ERRORS OF SINGLE SHEET MEANS OF QUALITY
RATINGS. (Panels: numerical scheme, subjective scheme. Abscissa: number
of readings per observer; ordinate: standard error in per cent of total
range of ratings.)
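
The comparison of designs can be checked numerically. The sketch below assumes that the variance of a sheet mean for m observers each taking n readings is ((SR) + n(SO)) / (mn); this form is an assumption, chosen because it reproduces the values 0.1321 and 0.36 quoted for three observers and two readings. The components used are those quoted in the text for the subjective scheme, (SR) = 0.3884 and (SO) = 0.2022, and the function name is hypothetical.

```python
import math

SR_SUB = 0.3884  # sheet-repetition component, subjective scheme
SO_SUB = 0.2022  # sheet-observer component, subjective scheme


def sheet_mean_se(sr, so, observers, readings):
    """Standard error of a sheet mean under the assumed variance formula."""
    variance = (sr + readings * so) / (observers * readings)
    return math.sqrt(variance)


# Design of the present experiment: 3 observers, 2 readings each.
print(round(sheet_mean_se(SR_SUB, SO_SUB, 3, 2), 2))

# One observer with many readings against several observers with few.
for observers, readings in [(1, 100), (2, 2), (3, 1)]:
    se = sheet_mean_se(SR_SUB, SO_SUB, observers, readings)
    print(observers, readings, round(se, 3))
```

Because the (SO) term is divided only by the number of observers, extra readings by one observer cannot push the variance below (SO), which is why adding observers is the more effective step.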

The accuracy of the above formulae may be judged from the results 
of the drying study mentioned above (Kauman, Gottstein and Lantican, 
1956) in which observer A twice subjectively evaluated the quality of 
340 sheets from which the 20 specimens of the present investigation 
were later selected. The SR component of variance was 0.4225, compared
with 0.3884 in the present results (Table 6). Using 0.2022 as
the value of the SO interaction (no value is available for the one observer
in the drying experiment), the standard error of sheet means in the
drying experiment was 0.643, a difference of only 2 per cent from the
predicted value (0.630).
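
The figures in this check can be reproduced as follows, under the assumption that the sheet-mean variance for one observer taking two readings is ((SR) + 2(SO)) / 2. The variable names are hypothetical; the component values are those quoted above.

```python
import math

SO = 0.2022          # sheet-observer component, present experiment
SR_DRYING = 0.4225   # sheet-repetition component, drying study
SR_PRESENT = 0.3884  # sheet-repetition component, present experiment

OBSERVERS, READINGS = 1, 2  # observer A evaluated each sheet twice


def sheet_mean_se(sr, so, observers, readings):
    """Standard error of a sheet mean, assuming V = (SR + n*SO) / (m*n)."""
    return math.sqrt((sr + readings * so) / (observers * readings))


observed = sheet_mean_se(SR_DRYING, SO, OBSERVERS, READINGS)
predicted = sheet_mean_se(SR_PRESENT, SO, OBSERVERS, READINGS)
relative_gap = (observed - predicted) / predicted

print(round(observed, 3), round(predicted, 3), round(100 * relative_gap, 1))
```

The two standard errors come out close to 0.643 and 0.630, a gap of roughly 2 per cent.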
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5.2 Variance of Observer Means (Quality Ratings) 


Although the variance of sheet means is of greatest importance 
when defining desirable levels of accuracy and assessing the usefulness 
of an evaluation scheme, it is also of some interest to consider the vari- 
ance of observer means. The latter measures the internal consistency 
of results for any observer and is appropriate for testing the significance 
of the differences between observers. It could be useful in deciding 
whether observers can be changed in the course of an experiment. 

In the present experiment, the variance of observer means, $V_o$, is
given by

$$V_o = \frac{(SR) + 2(SO) + 20(R)}{40} \tag{3}$$

Upon substitution of the components of variance, equation (3)
yields for the numerical scheme

$$V_o(\text{num.}) = 1.68 \quad \text{(Standard error 1.3)}$$

and for the subjective scheme

$$V_o(\text{sub.}) = \frac{0.7928}{40} = 0.01982 \quad \text{(Standard error 0.14)}$$

The general expression for the variance of observer means, for
observers taking n readings on each of p sheets, is

$$V_o = \frac{(SR) + n(SO) + p(R)}{np} \tag{4}$$

the formula for the numerical scheme being

$$V_o(\text{num.}) = \frac{(SR)}{np} + \frac{5.61}{p} + \frac{2.28}{n}$$

and for the subjective scheme

$$V_o(\text{sub.}) = \frac{0.3884}{np} + \frac{0.2022}{p}$$


As stated earlier, the observer means in the present experiment did 
not differ significantly. 

It should be noted that observer variance had to be calculated from 
three mean squares in the analysis of variance, unlike sheet variance 
which could be found from one mean square. Observer variances are 
thus not very accurate. 
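
A short sketch of the observer-mean calculation for the subjective scheme is given below. It assumes the general form V_o = ((SR) + n(SO) + p(R)) / (np), with the repetition component (R) taken as zero for the subjective scheme because it was non-significant; both points are assumptions, and the function name is hypothetical.

```python
import math

SR_SUB = 0.3884  # sheet-repetition component, subjective scheme
SO_SUB = 0.2022  # sheet-observer component, subjective scheme
R_SUB = 0.0      # repetition component, taken as zero (non-significant)


def observer_mean_variance(sr, so, r, readings, sheets):
    """Variance of an observer mean for n readings on each of p sheets."""
    return (sr + readings * so + sheets * r) / (readings * sheets)


# Present experiment: 2 readings on each of 20 sheets.
v_o = observer_mean_variance(SR_SUB, SO_SUB, R_SUB, readings=2, sheets=20)
print(round(v_o, 5), round(math.sqrt(v_o), 2))

# Effect of the number of readings for a fixed number of sheets.
for n in (1, 2, 4):
    v = observer_mean_variance(SR_SUB, SO_SUB, R_SUB, readings=n, sheets=20)
    print(n, round(math.sqrt(v), 3))
```

With two readings on twenty sheets this gives 0.01982 and a standard error of about 0.14, as quoted for the subjective scheme.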
Figure 7 gives the standard errors of observer means (in per cent
of the total range of ratings) for different numbers of sheets and readings.
As would be expected, the variance of observer means decreases with
increase in the number of readings. An appreciable improvement in
observer consistency may be obtained by increasing the number of
readings from 1 to 2, but an increase in the number of readings beyond 2
does not result in improvements of any great importance in either
scheme, except for a very small number of sheets.

FIG. 7. STANDARD ERRORS OF OBSERVER MEANS OF QUALITY RATINGS.
(Panels: numerical scheme, subjective scheme. Abscissa: number of samples
examined; ordinate: standard error in per cent of total range of ratings.)

The actual observer means of the quality ratings in both schemes 
may be found in Table 7. 


5.3 Results for Degrade Ratings 


The degrade ratings for the individual degrade types in both schemes 
were analyzed in the same manner as the quality ratings. Means and 
standard errors for each degrade type (and also for quality ratings) are 
given in Table 7. In the numerical scheme, observer differences were 
highly significant only for face checking. Significance at the 5 per cent 
level was found for buckling, but differences for end splitting and 
through-checking were not significant. In the subjective scheme,
observer differences for degrade ratings were found highly significant 
for buckling, face checking and through-checking, but not for end 
splitting. The differences were such that observer differences for 
quality ratings were not significant, as noted earlier. 
The standard errors are a measure of the variation between observers,
but also of the internal consistency of the observations by each observer. 
Thus, for instance, although the observer means for end splitting and 
through-checking in the subjective scheme are almost identical, the 
standard errors differ considerably, indicating that the internal consist- 
ency of observers in the evaluation of through-checking was considerably 
greater than for end splitting. No such differences in the standard 
errors were observed in the numerical scheme. 

The high significance of observer differences found for buckling and 
face checking might be expected since there is usually less variation 
within these two degrade types than in end splitting and through- 
checking. In other words, the consistency within observers, and hence 
the significance of observer differences should be higher for the former 
than for the latter. With the exception of the subjective degrade rating 
for through-checking, this was indeed the case. An alternative
explanation could be that there was less variation in through-checking than in
end splitting, but this seems not to be borne out by the results of the
numerical scheme.

It is interesting to note the correlation between observer means in 
the two schemes. Observer C, for instance, was consistently high in his 
evaluation of buckling, but low for end splitting. B obtained high 
readings for end splitting and face checking, the latter particularly in 
the numerical scheme where he appears to have classified a proportion 
of through-checks as face checks. It might be noted here that differentia- 
tion between face checks and very narrow through-checks was reported 
particularly difficult by all observers. 


6. DISCUSSION 


The present experiment has shown that subjective evaluation can 
yield results of an accuracy approaching that of the numerical scheme, 
although the accuracy of the latter was slightly superior. 

If it is postulated that the maximum permissible standard error of
single sheet means for quality ratings should not exceed 5 per cent of
the total range of ratings, then the standard error should be less than
2.5 in the numerical scheme, and less than 0.4 in the subjective scheme.

Inspection of Figure 6 shows that this condition is fulfilled in the 
numerical scheme for three observers taking one reading, or for two
observers taking two readings each. One single observer would have 
to take about 100 readings to obtain a similar accuracy, although 17 
readings will give a standard error below 5 per cent. 

In the subjective scheme, on the other hand, two observers would 
have to take 4 readings, and three observers 2 readings each to obtain
the desired accuracy. A single observer could never realize the postu- 
lated accuracy even with an infinite number of readings. 
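
The floor on a single observer's accuracy can be illustrated with a short sketch. It assumes a two-component variance model (one component between observers, one between repeated readings by the same observer); this model and the variance values below are illustrative assumptions, not figures taken from the experiment.

```python
import math

def standard_error(var_observer, var_reading, n_observers, n_readings):
    """Standard error of a single-sheet mean under an assumed two-component model.

    var_observer: variance between observers (hypothetical value)
    var_reading:  variance between repeated readings by one observer (hypothetical)
    """
    return math.sqrt(var_observer / n_observers
                     + var_reading / (n_observers * n_readings))

def single_observer_floor(var_observer):
    """Limit of the standard error for one observer as readings increase without bound."""
    return math.sqrt(var_observer)

if __name__ == "__main__":
    # Illustrative variance components only; not estimated from the paper's data.
    VAR_OBSERVER, VAR_READING = 0.20, 0.30
    for observers, readings in [(1, 1), (1, 100), (2, 4), (3, 2)]:
        se = standard_error(VAR_OBSERVER, VAR_READING, observers, readings)
        print(observers, readings, round(se, 3))
    print("floor for one observer:", round(single_observer_floor(VAR_OBSERVER), 3))
```

Under this assumed model, extra readings reduce only the second term, so whenever the between-observer component alone exceeds the permitted standard error, no number of readings by one observer can meet the requirement.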

More specifically, a comparison of variance ratios shows that in 
general, the subjective scheme requires 40 per cent more readings than 
the numerical scheme to give a similar accuracy. 

However, numerical quality evaluation, necessitating detailed rating 
of all degrade types, occupies considerably more time than subjective 
quality rating. The approximate average times were as follows: 


(i) Numerical evaluation of quality rating (which includes
detailed evaluation of degrade ratings)                      3 min
(ii) Subjective evaluation of quality rating only             0.5 min
(iii) Subjective evaluation of quality rating and degrade
ratings                                                       1.5 min


Figure 8 gives a graphical comparison of the time required to realize 
a desired degree of accuracy by different numbers of observers and 
readings when using the various evaluation schemes. A logarithmic 
time scale has been used for convenience. The plots indicate that in 
applications where only quality ratings are desired, the subjective 
method, though slightly less accurate for a given number of observers 
and readings, is clearly preferable on account of the much greater 
rapidity of evaluation. 

If a detailed record of degrade ratings is desired, the difference in 
time required by the two schemes for quality evaluation by up to three 
observers is largely eliminated. If four or more observers take part in 
the evaluation, the subjective scheme has again the advantage of re- 
quiring considerably less time. 

Where it is inconvenient to use more than two observers, the numer- 
ical scheme has a margin over the subjective scheme as regards accuracy 
and time required. In addition, the numerical scheme records greater 
detail and would therefore permit finer analysis of the factors contribut- 
ing to degrade. 

Neither of the schemes appeared to have any advantage over the 
other regarding the relative difficulty and the time required for the 
training of observers by personal tuition, but it is considered that the 
numerical scheme would be more suitable for initial training by written 
instructions only, and for standardization purposes. 

As noted before, the quality evaluation schemes presented above 
were developed to deal with the drying of “‘ash” eucalypt veneers which 
are prone to considerable drying degrade and collapse. The schemes 
are, however, sufficiently flexible to be applied to other types of veneer 
[FIG. 8. Time required for quality evaluation. Panels: 1, 2, 3 and 4 observers. Axes: standard error in per cent of total range of quality ratings; time required per sample for quality evaluation with a given standard error. Curves: numerical scheme; subjective scheme (evaluating degrade and quality ratings); subjective scheme (evaluating quality ratings only). Small figures on curves represent number of readings taken by each observer.]



and possibly to sawn timber; criteria for evaluation can easily be re-
defined to take into account different forms of degrade.

It will, in general, be simpler first to re-define the numerical scheme
for a given class of material, and to use the criteria thus obtained for the 
re-definition of the subjective scheme. In the former, it will usually be 
possible to place different degrade types in proper perspective by re-
allocation of appropriate numerical weights. It will be convenient to
allocate the weights in such a way that the highest possible numerical 
quality rating is a round number (e.g. 50 or 100), although this is not 
absolutely essential. 

In some cases, it might appear desirable to change the criteria for 
severity classes. It would, for instance, be possible to use the total 
surface area instead of the total length of checks, but such a procedure 
would entail considerable difficulties of measurement. In cases where 
buckling is the major form of degrade, it might be deemed convenient 
to define more closely the criteria for severity ratings for this degrade 
type. 

Although observers found no apparent difficulty in dissociating
defects present prior to drying from the actual drying degrade, it might 
be of value to evaluate quality ratings before and after drying, taking 
into account all defects present, and subtracting the former rating from 
the latter to obtain "dried quality".

It might also be proposed to change the rating of a degrade type
abruptly above a suitably defined threshold level. In the authors'
opinion, however, the evaluation schemes presented here are not suitable 
for defining a level of acceptance or rejection of a sheet in this manner. 
Their function is rather to provide continuous and reasonably uniform 
scales for the evaluation of degrade types in order to compare the effect 
of different experimental conditions. 

Having re-defined the numerical scheme to deal with a given set of
requirements, it is an easy matter to apply the new definitions to the 
subjective scheme. In the latter, it might under certain circumstances 
be of value to increase the number of quality ratings; by such a procedure 
it might be possible to decrease the standard error. 

Although the association of subjective ratings with numbers greatly 
facilitates subsequent analysis of the results, it is considered that the
verbal designations of quality ratings (excellent, very good, etc.) and 
of severity classes (very slight, slight, etc.) should be used by observers 
until sufficient experience in using the scheme has been gained. The 
use of intermediate ratings, such as slight to moderate, or fair to poor, 
though a convenient way of escape from making a decision, should be 
discouraged. If observers find the number of ratings available in-
sufficient, it would be better to increase their number, as suggested
above, so that each rating can be associated with an integral index
number.

It may be considered desirable, under certain circumstances, to
define quality rating so that the highest rating corresponds to the best
quality. This can easily be achieved by re-defining the quality rating
as follows:

$$\text{Quality rating} = 50 - \sum (\text{Degrade ratings}).$$


THE ESTIMATION OF AGE-SPECIFIC INFECTION RATES
FROM A CURVE OF RELATIVE INFECTION 


P. WHITTLE 


Applied Mathematics Laboratory, New Zealand D.S.I.R.


Summary 


Consider a sample of individuals (in which the age classes are not 
necessarily proportionally represented) from a population subject to 
some permanent infection. A procedure is described for the estimation 
of the infection rate as a function of age, and of the expectation of life 
after infection. The method is applied to unpublished data supplied
by P. C. Bull on the incidence of liver lesions due to coccidiosis (Eimeria
stiedae) in the New Zealand wild rabbit.


(1) The model 


Disease and parasitism belong to those features of an animal popu- 
lation for which laboratory examination must be supplemented by 
sampling investigations in the field if a satisfactory picture of the 
natural state of the animal is to be obtained. The sampling approach 
is regrettably indirect, but can prove informative. 

Consider an animal population subject to some permanent infection 
with recognisable symptoms, and suppose the population divisible into 
suitable age categories, e.g. 0-1 weeks, 1-2 weeks, etc. Suppose now 
that a sample is taken from the population in such a way that for any 
age category healthy and infected animals will be represented in their 
correct proportions (apart from sampling fluctuations). The sample 
will presumably include animals of a range of ages, and we shall suppose 
that in the $j$th age category $h_j$ healthy and $i_j$ infected animals are
captured. 

If one could be sure that all age categories were proportionally
represented, and if the expected mortality due to natural causes at all
ages were known, then it would be a direct matter to estimate the rates
at which healthy animals were being infected and infected animals
were dying of disease. Failing this, the relative proportions of infected
animals in the different age groups

$$p_j = \frac{i_j}{h_j + i_j} \tag{1}$$

prove to be useful quantities. We shall call the graph of these pro-
portions against age the curve of relative infection.

We shall assume that an animal healthy at age $t$ has a probability
$\lambda(t)\,dt$ of becoming infected in the age interval $(t, t + dt)$, so that
$\lambda(t)$ is the infection rate for age $t$. We shall suppose that there is a
similar natural death rate $\alpha(t)$ which applies equally to healthy and
infected animals of equal ages. Thus, in the case of the rabbits which we
shall consider later, there are the natural hazards to young rabbits of
hawks, etc. Finally, we shall assume that animals which contract
infection at age $t - s$ have a death rate due to the infection of
$\beta(s, t)$ (so that $\beta(s, t)\,dt$ is the probability of a life between $t$
and $t + dt$, or a sickness duration between $s$ and $s + dt$, given that
infection was contracted at age $t - s$).

A useful extension of the model would be to allow recovery after 
infection, but to attempt estimation of recovery rates as well as infec- 
tion and death rates is to ask too much of the data. It is only thanks 
to rather special circumstances and by introduction of specialising 
assumptions that we can extract as much information from the data as 
we do. Further, many animal infections are permanent in that once 
they reach a certain stage they can only end in death—the animal 
either dies of sickness or is so weakened as to fall an easy prey to its 
enemies. 

If we begin with a group of $N$ animals born at time $t = 0$ (so that
$t$ can be used to represent age or time indiscriminately) then it follows
by conventional reasoning (see, for example, Feller p. 364) that the
expected number of healthy animals remaining at time $t$ is

$$H(t) = N \exp\left\{-\int_0^t [\alpha(u) + \lambda(u)]\,du\right\} \tag{2}$$

The expected number becoming infected in the age interval $(t, t + dt)$ is

$$K(t)\,dt = H(t)\lambda(t)\,dt \tag{3}$$

whence the total number of infected animals living at time $t$ is

$$\begin{aligned}
I(t) &= \int_0^t K(u) \exp\left\{-\int_u^t [\alpha(v) + \beta(v - u, v)]\,dv\right\} du \\
&= N \exp\left\{-\int_0^t \alpha(u)\,du\right\} \int_0^t \exp\left\{-\int_0^u \lambda(v)\,dv - \int_u^t \beta(v - u, v)\,dv\right\} \lambda(u)\,du
\end{aligned} \tag{4}$$

The ratio of the expected numbers of infected and healthy animals of
age $t$ is thus

$$R(t) = \frac{I(t)}{H(t)} = \int_0^t \exp\left\{\int_u^t [\lambda(v) - \beta(v - u, v)]\,dv\right\} \lambda(u)\,du \tag{5}$$

This ratio is independent of the natural death rate $\alpha(t)$, as might have
been expected.

The theoretical curve of relative infection, with which the empirical
curve (1) is to be compared, is

$$P(t) = \frac{I(t)}{I(t) + H(t)} = \frac{R(t)}{1 + R(t)} \tag{6}$$

Now, we cannot estimate two arbitrary functions $\lambda$ and $\beta$ on the
basis of a single empirical function $p_j$. Only by restricting one or both
of $\lambda$ and $\beta$ can we hope to obtain determinate estimates. In the next
two sections we shall consider two such specialisations: the extreme
cases of a fixed sickness duration and of a random sickness duration.

Let it be noted that we are not completely without information on
the functions $\lambda$ and $\beta$. By their nature both are positive and are unlikely
to have more than one peak. This knowledge may be turned to good
account.


(2) The case of fixed sickness duration 


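Suppose that an animal contracting infection at age $t$ dies after a
further fixed period of time $L$ (it is easy to generalise to the case when
the sickness duration $L$ is a function of age at the time of contraction).
Expression (4) then becomes

$$\begin{aligned}
I(t) &= N \exp\left\{-\int_0^t \alpha(u)\,du\right\} \int_{c(t)}^t \exp\left\{-\int_0^u \lambda(v)\,dv\right\} \lambda(u)\,du \\
&= N \exp\left\{-\int_0^t \alpha(u)\,du\right\} \left[\exp\left\{-\int_0^{c(t)} \lambda(u)\,du\right\} - \exp\left\{-\int_0^t \lambda(u)\,du\right\}\right]
\end{aligned} \tag{7}$$

where $c(t)$ is the earliest age at which a living infected animal of age $t$
could have contracted infection, and is thus given by

$$c(t) = \begin{cases} 0 & (t \le L) \\ t - L & (t \ge L) \end{cases} \tag{8}$$

The proportion of infection is then a simple function of $\lambda$:

$$P(t) = 1 - \exp\left\{-\int_{c(t)}^t \lambda(u)\,du\right\} \tag{9}$$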
Putting

$$Q(t) = -\log [1 - P(t)] \tag{10}$$

we can invert (9) to obtain an expression for $\lambda$:

$$\lambda(t) = \begin{cases} Q'(t) & (t < L) \\ Q'(t) + \lambda(t - L) & (t \ge L) \end{cases} \tag{11}$$

In practice, if $p_j$ is known at (say) weekly intervals then equations (11)
will be applied in difference form to provide a solution for $\lambda$. Thus, if

$$q_j = -\log (1 - p_j) \tag{12}$$

and the expected sickness duration is an integral number of weeks, $L$,
then solution of the equations

$$\lambda_j = \begin{cases} q_{j+1} - q_j & (j < L) \\ q_{j+1} - q_j + \lambda_{j-L} & (j \ge L) \end{cases} \tag{13}$$

will yield a set of constants $\lambda_j$, where $\lambda_j$ can be approximately interpreted
as the estimated probability that an animal healthy when $j$ weeks old
will contract infection in the coming week. (Rather, that an animal
apparently healthy when $j$ weeks old will show symptoms in the coming
week, since infection will presumably not manifest itself immediately.)

If the expected sickness duration $L$ is not known, then we must
choose a value for which the solutions $\lambda_j$ of (13) are positive, and, pre-
ferably, unimodal. In this way, one can fix an upper bound for the
true $L$, since substitution of too large an $L$ in (13) will produce an
oscillating solution.
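
The difference scheme (12)-(13) can be sketched in a few lines. The function name and the indexing convention (the list starts at age 0, with one entry per week) are assumptions made for illustration only.

```python
import math

def infection_rates_fixed_duration(p, L):
    """Solve the difference equations (13) for lambda_j.

    p: proportions infected p_0, p_1, ... at weekly ages 0, 1, ...
       (indexing from age 0 is an assumption of this sketch)
    L: fixed sickness duration in whole weeks
    """
    q = [-math.log(1.0 - pj) for pj in p]          # equation (12)
    lam = []
    for j in range(len(q) - 1):
        value = q[j + 1] - q[j]                    # case j < L
        if j >= L:
            value += lam[j - L]                    # case j >= L
        lam.append(value)
    return lam

if __name__ == "__main__":
    # Hypothetical weekly proportions, for illustration only.
    proportions = [0.0, 0.05, 0.14, 0.26, 0.26, 0.14]
    for L in (1, 2, 3):
        print(L, [round(x, 3) for x in infection_rates_fixed_duration(proportions, L)])
```

Trying several values of L on the same proportions, as in the usage lines, is one way to look for the point at which the solution starts to turn negative or oscillate.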


(3) The case of random sickness duration 


We shall go to the opposite extreme in this section, and suppose the
sickness death rate to be independent of sickness duration, so that we
can write

$$\beta(s, t) = \beta(t) \tag{14}$$

It is now readily verified that

$$R(t) = \frac{I(t)}{H(t)} = \int_0^t \exp\left\{\int_u^t [\lambda(v) - \beta(v)]\,dv\right\} \lambda(u)\,du \tag{15}$$

is a solution of the differential equation

$$R'(t) + (\beta(t) - \lambda(t))R(t) = \lambda(t), \tag{16}$$

so that the solution for $\lambda(t)$ in terms of observed quantities becomes in
this case

$$\lambda(t) = \frac{R'(t) + \beta(t)R(t)}{1 + R(t)} \tag{17}$$

In the previous section a knowledge of $L$ (or $L(t)$) was necessary to
enable us to complete our solution: in this case we require correspondingly
a knowledge of $\beta(t)$, the age-specific death intensity. We can, of course,
suppose it equal to a constant and thus make possible an approximate
solution for $\lambda$. Just as the positivity of $\lambda$ enabled us before to find an
upper bound for $L$, it enables us now to find a lower bound for $\beta$. We
have from (17)

$$\beta(t) \ge -\frac{R'(t)}{R(t)}$$

In this way we obtain a lower bound for $\beta$, but it seems that one cannot
establish an upper bound (or a lower bound for the fixed sickness dura-
tion $L$ of the previous section) without making further restricting
assumptions. We shall return to this question in the next section.
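
As a check on the relation between the ratio R(t) and the infection rate, the following sketch takes both rates to be constants (an assumption made purely for illustration, with the two constants unequal), writes down the closed form that the integral for R(t) then takes, and confirms that the inversion formula returns the infection rate that was put in. The function names are hypothetical.

```python
import math

def ratio(t, lam, beta):
    """R(t) for constant infection rate lam and constant sickness death rate beta.

    With constant rates the integral for R(t) reduces to
    lam * (exp((lam - beta) * t) - 1) / (lam - beta), assuming lam != beta.
    """
    k = lam - beta
    return lam * (math.exp(k * t) - 1.0) / k

def ratio_derivative(t, lam, beta):
    """R'(t) for the same constant-rate case."""
    return lam * math.exp((lam - beta) * t)

def recovered_rate(t, lam, beta):
    """Infection rate recovered from R and R' as (R' + beta * R) / (1 + R)."""
    r = ratio(t, lam, beta)
    dr = ratio_derivative(t, lam, beta)
    return (dr + beta * r) / (1.0 + r)

if __name__ == "__main__":
    for t in (0.5, 3.0, 10.0):
        print(t, recovered_rate(t, lam=0.3, beta=0.8))
```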

We have framed our model in continuous time, as is natural. How-
ever, since observations are taken at discrete intervals of time the
differential relation (16) must be approximated by a difference equation.
Thus, if we have observed values $r_j$ of the ratio $R(t)$ taken at unit
intervals of time, then the obvious adaptation of equation (17) would be

$$\lambda_{j+\frac{1}{2}} = \frac{(r_{j+1} - r_j) + \tfrac{1}{2}\beta(r_{j+1} + r_j)}{1 + \tfrac{1}{2}(r_{j+1} + r_j)} \tag{18}$$

This is the estimating equation we shall use in the next section.
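
A minimal sketch of this estimating equation follows. The proportions used are the values of p for ages 5 to 12 weeks as printed in Table 1; the function names are hypothetical, and the lower bound for a constant death rate is taken as the largest observed value of -R'/R, as described in the text.

```python
def mid_week_quantities(p):
    """Return lists of mid-week estimates (R, R') from weekly proportions p."""
    r = [pj / (1.0 - pj) for pj in p]              # r_j = p_j / (1 - p_j)
    means = [0.5 * (r[j + 1] + r[j]) for j in range(len(r) - 1)]
    diffs = [r[j + 1] - r[j] for j in range(len(r) - 1)]
    return means, diffs

def estimate_infection_rates(p, beta):
    """Mid-week infection rates for a constant sickness death rate beta."""
    means, diffs = mid_week_quantities(p)
    return [(d + beta * m) / (1.0 + m) for m, d in zip(means, diffs)]

def beta_lower_bound(p):
    """Largest observed -R'/R, which a constant beta must exceed."""
    means, diffs = mid_week_quantities(p)
    return max(-d / m for m, d in zip(means, diffs))

if __name__ == "__main__":
    # p for ages 5, 6, ..., 12 weeks, as printed in Table 1.
    p_weeks_5_to_12 = [0.200, 0.530, 0.645, 0.715, 0.745, 0.720, 0.640, 0.530]
    print(beta_lower_bound(p_weeks_5_to_12))
    for beta in (0.45, 1.00):
        print(beta, [round(x, 3) for x in estimate_infection_rates(p_weeks_5_to_12, beta)])
```

Because the printed proportions are rounded, the computed rates may differ from the tabulated ones in the third decimal place.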


(4) An application to observed data 


The data came from an extensive investigation by Bull (now being 
prepared for publication) on the incidence of parasites in the New 
Zealand wild rabbit. The particular parasite considered is Eimeria 
stiedae, (Protozoa), which destroys the epithelial cells of the bile ducts 
and leads to the appearance of severe lesions in the liver (Becker, 1934). 

The sample was one of 4143 rabbits, whose ages were estimated
from the paunched weight. (The weight/age curve for diseased animals
may well differ from that for healthy animals. However, this point has
not been investigated and in this present analysis we can make no
allowance for it.) Animals were counted as infected if more than half
the liver showed severe lesions. Stephens (1952) referring to E. stiedae
in young rabbits states that "it can be assumed that in severe attacks
the animal generally dies before reaching maturity."
TABLE 1
Incidence of Parasites, New Zealand Wild Rabbit

Age in                                Estimated infection rate
weeks       p        r        r'      β = 0.45     β = 1.00

  3       0.090
                   0.093   -0.012      0.027        0.074
  4       0.080
                   0.169    0.163      0.204        0.284
  5       0.200
                   0.689    0.878      0.703        0.928
  6       0.530
                   1.473    0.689      0.547        0.874
  7       0.645
                   2.163    0.692      0.526        0.903
  8       0.715
                   2.716    0.413      0.440        0.842
  9       0.745
                   2.747   -0.351      0.236        0.639
 10       0.720
                   ...     -0.798      0.059        0.435
 11       0.640
                   1.453   -0.650      0.002        0.327
 12       0.530
                   0.919   -0.319      0.049        0.313
 ...
                   ...     -0.134      0.094        0.309
 ...
                   ...     -0.115      0.078        0.265
TABLE 1 (concluded)

Age in                                Estimated infection rate
weeks       p        r        r'      β = 0.45     β = 1.00

 24       0.135
                   0.143   -0.026      0.033        0.102
 25       0.115
                   0.124   -0.013      0.038        0.099
 26       0.105
                   0.108   -0.018      0.028        0.081
 27       0.090
                   0.093   -0.012      0.027        0.074
 28       0.080
                   0.076   -0.023      0.010        0.049
 29       0.060

A graphical interpolation by weeks of the age-grouped data yielded
the curve of relative infection in the diagram (full line); the values $p_j$
are given in the second column of Table 1.

An application of the method of section 2, assuming a constant
sickness duration, was not at all successful. One had to choose an
absurdly low value of $L$ in order to obtain a $\lambda$ solution which would not
oscillate violently. The reason for this becomes apparent on an inspec-
tion of the diagram. The curve rises much more quickly than it falls.
If sickness duration were constant, however, the large increase of infec-
tion in the 5-6 week period would be followed by a roughly corresponding
decrease some time later when this age-group died off. The fact that
there is no such decrease indicates that the hypothesis of a fixed sickness
duration is untenable. In fact, the gradual decay of the curve indicates
that sickness duration must be fairly broadly distributed.

[Diagram: Curve of relative infection, New Zealand wild rabbit. Abscissa: age in weeks (t). Curves: P(t) = percentage of infection (full line); λ(t) = estimated infection rate (β = 0.45); λ(t) = estimated infection rate (β = 1.00).]

Let us consider the alternative hypothesis of a uniform death rate.
In the third and fourth columns of the table we have recorded the
values of $\tfrac{1}{2}(r_{j+1} + r_j)$ and $(r_{j+1} - r_j)$ which we shall take as estimates
of $R(t)$ and $R'(t)$ at the mid-week $t = j + \tfrac{1}{2}$. The maximum observed
value of $-R'/R$ occurs at $11\tfrac{1}{2}$ weeks: 0.447. The constant $\beta$ must thus
be greater than 0.447, but probably not much greater, since $\lambda(t)$ prob-
ably assumes fairly small values once the animal has passed a critical age.

For trial values of $\beta$ we shall take $\beta$ = 0.45 and 1.00, corresponding
respectively to expected sickness durations of 1/0.45 = 2.2 weeks =
15.4 days, and 1/1.00 = 1 week. (These figures, particularly the first,
are quite consistent with laboratory investigations of E. stiedae infection,
which give a sickness duration of between 21 and 30 days (Becker, 1934).
Since lesions appear in the liver about 10 days after infection, the actual
time from appearance of the symptoms until death is of the order 11-20
days. However, in Bull's sample, rabbits were only counted as infected
if more than half the liver showed severe lesions, and expectation of
life for animals in which infection is so far advanced would almost
certainly be less than 11-20 days.)

The values of $\lambda$ estimated using formula (18) with $\beta$ = 0.45 and 1.00
are recorded in the table and on the diagram. Neither of the two
estimated curves conforms completely to our ideal of a smooth unimodal
curve, but there are at least three reasons for this:


(a) The model includes several rather drastic assumptions (in particular
the assumption that the death rate $\beta(s, t)$ is constant can be valid
only as a very rough approximation) and the fitting of an over-simplified
model will produce systematic errors in the fitted curve.

(b) The curve is subject to sampling fluctuations. These have been
largely smoothed out in the present case, but may have some effect at
the extremes of the curve, since there were few rabbits in the extreme
age categories.

(c) In smoothing a curve by eye one can unknowingly introduce spurious
deviations, which will become more pronounced if the curve is differ-
entiated. Further, if one substitutes difference quotients for differential
coefficients, one introduces a bias which is negligible if the gradient is
small, but which may be appreciable when the ordinate changes quickly
as it does at $t$ = 5-6 weeks.

The curve for $\beta$ = 1 is more regular than that for $\beta$ = 0.45, but at
least part of this gain in regularity is illusory. If we let $\beta$ become large
in formula (17), we find that $\lambda(t)$ becomes proportional to $P(t)$. The
same holds if we let $L$ become small in (11). In other words, the larger
$\beta$ or the smaller $L$, the nearer will the estimated $\lambda_j$ curve approach to
the original $p_j$ curve. In this case, as in most based on a large sample,
the $p_j$ curve is itself regular (i.e. smooth and unimodal), so that as we
choose a larger $\beta$ or a smaller $L$ we shall observe an increase in regularity
of the estimated $\lambda_j$ curve which is quite deceptive.

One would naturally wish to have an objective method of reaching
determinate estimates of the rate $\lambda$ and of testing the final fit of the
model. However, this will be possible only if one is justified in making
a number of very restrictive assumptions (concerning both the functional
forms of $\alpha$, $\beta$ and $\lambda$, and the nature of the sampling fluctuations) or if
one can take a sample which is inherently more informative (i.e. sample
age groups proportionally).

However, it may be asserted that the method as it stands gives one 
a positive indication of the upper limit of sickness duration; and a 
lower limit can often be set from one’s experience of the particular 
infection. In the case of the rabbit data considered in this section it 
was even possible to form some conclusions on the nature of the sickness 
duration (i.e. that it was broadly rather than sharply distributed). 
Finally, if the distribution of sickness duration is known, one can evaluate
the age/susceptibility function $\lambda(t)$; while if little or no a priori informa-
tion on sickness duration is available, one can still set plausible limits
on the form of $\lambda(t)$. Thus, in the present case, the true age/suscepti-
bility curve for rabbits almost certainly lies between the two estimated
curves.


The author’s thanks are due to Mr. P. C. Bull for valuable advice. 


AN EVALUATION OF THE REMOVAL METHOD
OF ESTIMATING ANIMAL POPULATIONS* 


CALVIN ZIPPIN 


Cancer Research Institute, School of Medicine, 
University of California, San Francisco, California 


I. INTRODUCTION 


Moran (1951) presented a method for obtaining maximum likelihood 
estimates of population size from the results of a series of trappings in 
which the trapped animals are removed from the population. This type
of trapping program may be appropriate when auxiliary studies of the 
trapped animals are to be made which necessitate their eventual sacrifice, 
or when economic or health reasons make it inadvisable to return trapped 
animals to the population. In such situations it is not possible to use 
the tagging and recapture method which has been discussed by numerous 
authors in recent years (see e.g. Leslie and Chitty (1951), Bailey (1951)). 
Moran’s method, which will be called the removal method, is a special 
case of a more general procedure described by De Lury (1947). 

The present paper will cover the following topics: 

1. Development of a rapid graphical procedure for obtaining maxi- 
mum likelihood estimates of population size from removal method data. 

2. Determination of the asymptotic precision of the removal method. 

3. Report on the results of experimental sampling which was con- 
ducted in order to compare the maximum likelihood estimates with 
estimates by an alternative regression method (Hayne, 1949) and to 
study the performance of the removal method on small populations. 

4. Determination of the proportion of the total population which 
must be trapped in order to reduce the coefficient of variation of estimates 
of population size to specified levels.

5. Tests of the assumptions underlying the removal method and their 
application to actual trapping data. 


*From the Department of Biostatistics, The Johns Hopkins University School of Hygiene and
Public Health, Baltimore, Maryland. Department Paper No. 311. This work was done under Navy
Contract Nonr-248 (16).

II. THEORY OF THE REMOVAL METHOD


The removal method assumes a stationary population during the 
trapping program and also that the probability of capture during a 
given trapping is the same for each animal and does not change from 
trapping to trapping. 

In this section the theory leading to maximum likelihood estimates of 
N, the population size, and p, the binomial probability of capture during 
a single trapping, will be reviewed for the purpose of introducing a rapid 
graphical method of obtaining estimates. New formulas for the standard 
errors of these estimates will also be presented. 


Conditional Binomial Approach 


Under the assumptions presented above, Moran applies the method of
maximum likelihood to obtain conditional binomial estimates of the
population size and of the probability of capture during a single trapping.

Let

$N$ = population size,
$p$ = probability of capture during a single trapping,
$y_i$ = number of animals captured during the $i$th trapping,

and

$x_i = \sum_{j=1}^{i-1} y_j$ = total number captured prior to the $i$th trapping.

The probability of capturing $y_i$ animals during the $i$th trapping,
given that $x_i$ animals had previously been captured, is therefore

$$P(y_i \mid x_i) = \binom{N - x_i}{y_i} p^{y_i} q^{N - x_i - y_i} \tag{1}$$

where $q = 1 - p$ = probability of escaping capture during a single
trapping.

The joint probability or likelihood of the sample of catches actually
observed in $k$ trappings is

$$P(S) = \prod_{i=1}^{k} P(y_i \mid x_i) = \frac{N!}{(N - T)!\,\prod_{i=1}^{k} y_i!}\; p^{T} q^{kN - Q} \tag{2}$$

where

$$T = \sum_{i=1}^{k} y_i \quad \text{and} \quad Q = \sum_{i=1}^{k} x_i + \sum_{i=1}^{k} y_i.$$

This same result may be arrived at by looking at the problem slightly
differently. We may consider the entire population as falling into
$(k + 1)$ categories: those captured during each of the $k$ trapping
periods and those escaping capture. The probability that an individual
is captured during the first trapping is $p$. Under the assumption that
the probability of capture does not change from trapping to trapping,
the probability of escaping capture during the first trapping period and
being trapped during the second trapping interval is $pq$. In general the
probability of being captured during the $i$th trapping interval is $pq^{i-1}$
and the probability of escaping capture during each of the $k$ trappings is
$q^k$. According to the multinomial probability distribution, the prob-
ability that $y_1, y_2, \ldots, y_k$ animals are trapped during the $k$ successive
trappings and that $(N - T)$ animals escape capture is given by

$$P(S) = \frac{N!}{y_1!\,y_2! \cdots y_k!\,(N - T)!}\; p^{y_1} (pq)^{y_2} (pq^2)^{y_3} \cdots (pq^{k-1})^{y_k} (q^k)^{N - T},$$

which is another form of equation (2).

The logarithm of the likelihood of the sample of catches is then

$$L = \log P(S) = T \log p + (kN - Q) \log q + \log N! - \log (N - T)! - \sum_{i=1}^{k} \log y_i!.$$

Maximizing $L$ with respect to $p$, Moran obtains as the estimator
of the probability of capture an expression which is equivalent to

$$\hat{p} = \frac{T}{kN - \sum_{i=1}^{k} x_i} \tag{4}$$

For his estimate of $N$, Moran obtains the value $\hat{N}$ which satisfies
Substituting this expression for $\hat{p}$ in equation (5) we obtain as the
estimate of $N$ that value $\hat{N}$ which satisfies the equation

$$\frac{\hat{N} - T}{\hat{N}} = \left(\frac{k\hat{N} - \sum_{i=1}^{k} x_i - T}{k\hat{N} - \sum_{i=1}^{k} x_i}\right)^{k} \tag{6}$$

This equation allows an estimate of $N$ without first approximating
$p$, although the process of substituting values of $N$ in both sides of
equation (6) until it is balanced is frequently quite tedious.

It is of interest to consider a partition of the joint probability of
the sample $P(S)$. For any specified animal, the probability that it
escapes capture during all $k$ trapping periods is $q^k$. Since the chance
of being captured is assumed to be independent from animal to animal,
the total number, $T$, of animals captured follows the binomial $(p_T + q_T)^N$,
where $q_T = q^k$ and $p_T = 1 - q_T$.

Hence the partition

$$P(S) = P(S \mid T)\,P(T)$$

is found from equation (2) to be

$$P(S) = \left[\frac{T!}{\prod_{i=1}^{k} y_i!} \prod_{i=1}^{k} \left(\frac{pq^{i-1}}{1 - q^k}\right)^{y_i}\right] \left[\frac{N!}{T!\,(N - T)!} (1 - q^k)^{T} (q^k)^{N - T}\right] \tag{7}$$

It may be noted that $N$ appears only in the second factor $P(T)$. This
shows that the maximum likelihood estimate of $N$ is obtained from the
binomial distribution of the total catch $T$. If $N$ is estimated by equating
$T$ to its expectation $N(1 - q^k)$, it may be shown that this estimate will
differ from the correct integral maximum likelihood estimate by less
than unity. Hence, as pointed out by Moran, we may estimate $N$ by

$$\hat{N} = \frac{T}{1 - \hat{q}^{k}} \tag{8}$$

which is an alternative form of equation (5).

Further, $q$ and $p$ are estimated from the first factor $P(S \mid T)$, the
probability of the distribution of catches. For if we take the derivative of
$\log P(S)$ with respect to $p$, the contribution from the second bracket in
(7) is

$$\frac{kTq^{k-1}}{1 - q^k} - \frac{k(N - T)}{q}.$$

It is easily verified that this contribution vanishes when $N = T/(1 - q^k)$.


These results form the basis for a quick graphical method of estimating 
p and hence N. 


Setting the derivative of $\log P(S \mid T)$ with respect to $p$ equal to zero,
we find

$$\frac{q}{p} - \frac{kq^k}{1 - q^k} = R \tag{9}$$

where

$$R = \frac{1}{T} \sum_{i=1}^{k} (i - 1)\,y_i.$$

$R$ may be calculated quickly from the data of a set of trappings.
The value $\hat{p}$ which satisfies (9) is the maximum likelihood estimate of $p$,
and $\hat{q} = 1 - \hat{p}$. Substitution of $T$ and $\hat{q}^k$ in equation (8) provides the
maximum likelihood estimate of $N$.
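
As a numerical counterpart to the graphical procedure, equation (9) can be solved for p by bisection and the result substituted in equation (8). This is a sketch only: the function name, the bracketing interval and the tolerance are assumptions, and bisection is used here in place of the graphs; it relies on the left side of (9) decreasing from (k - 1)/2 towards zero as p goes from 0 to 1.

```python
def removal_estimate(catches, tol=1e-12):
    """Estimate (p, N) from successive removal catches y_1, ..., y_k.

    Solves q/p - k*q**k / (1 - q**k) = R for p by bisection, then
    returns N = T / (1 - q**k).
    """
    k = len(catches)
    total = sum(catches)
    # enumerate() starts at 0, which supplies the (i - 1) weights directly.
    R = sum(i * y for i, y in enumerate(catches)) / total
    if R >= (k - 1) / 2.0:
        raise ValueError("catches do not decline enough for a finite estimate")

    def lhs(p):
        q = 1.0 - p
        return q / p - k * q**k / (1.0 - q**k)

    lo, hi = 1e-6, 1.0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) > R:      # left side too large, so p must be larger
            lo = mid
        else:
            hi = mid
    p_hat = 0.5 * (lo + hi)
    n_hat = total / (1.0 - (1.0 - p_hat) ** k)
    return p_hat, n_hat

if __name__ == "__main__":
    # Hypothetical catches from three trappings, for illustration only.
    print(removal_estimate([42, 23, 15]))
```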


Graphical Method of Estimating p and N 


Figure 1 has been drawn up to provide a means for rapidly obtaining 
$\hat{p}$ corresponding to a value of R calculated from the data of 3, 4, 5, or 7 
trappings. Estimates of $(1 - \hat{q}^k)$ may be obtained directly from Figure 2. 
Similar graphs may be constructed for values of k other than those given 
here. 


With the aid of these graphs, the procedure for obtaining an estimate 
of N is as follows: 

1. Calculate $R = \sum_{i=1}^{k} (i - 1)\, y_i / T$. 

2. Obtain the estimate of $(1 - \hat{q}^k)$ corresponding to R from the appropriate 
graph of Figure 2. 

[Figure 1: graphs for obtaining $\hat{p}$ from $R = \sum_{i=1}^{k} (i - 1)\, y_i / T$; panels for k = 3 and k = 4 trappings, where k = number of trappings, $y_i$ = number in ith catch, T = total catch.] 

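where M, the variance-covariance matrix of $\hat{N}$ and $\hat{p}$, is 

$$M = \begin{bmatrix} -E\left(\dfrac{\partial^2 L}{\partial p^2}\right) & -E\left(\dfrac{\partial^2 L}{\partial N\,\partial p}\right) \\ -E\left(\dfrac{\partial^2 L}{\partial N\,\partial p}\right) & -E\left(\dfrac{\partial^2 L}{\partial N^2}\right) \end{bmatrix}^{-1}.$$

In the above matrix 

$$\frac{\partial^2 L}{\partial N^2} = \frac{\partial^2}{\partial N^2} \log N! - \frac{\partial^2}{\partial N^2} \log (N - T)!,$$

$$\frac{\partial^2 L}{\partial p^2} = -\frac{T}{p^2} - \frac{kN - Q}{q^2},$$

and 

$$\frac{\partial^2 L}{\partial p\,\partial N} = -\frac{k}{q}.$$
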
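Since 

$$E(T) = N(1 - q^k),$$
$$E(N - T) = Nq^k.$$

It may also be shown that 

$$E(Q) = E\left[\sum_{i=1}^{k} (k - i + 1)\, y_i\right] = N\left[k - \frac{q(1 - q^k)}{p}\right].$$

Since $E(N - T) = Nq^k$, then asymptotically 

$$E\left(\frac{\partial^2 L}{\partial N^2}\right) \approx \frac{\partial^2 \log N!}{\partial N^2} - \frac{\partial^2 \log (Nq^k)!}{\partial N^2}.$$
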
Substituting the expected values for T and Q and reducing, we obtain 

$$E\left(-\frac{\partial^2 L}{\partial p^2}\right) = \frac{N(1 - q^k)}{p^2 q}.$$

Also 

$$E\left(-\frac{\partial^2 L}{\partial p\,\partial N}\right) = \frac{k}{q}.$$


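Thus the asymptotic variance of $\hat{N}$ is 

$$V(\hat{N}) = \frac{N(1 - q^k)}{N(-F)(1 - q^k) - \dfrac{(kp)^2}{q}} \qquad (15)$$

where 

$$F = \frac{\partial^2 \log N!}{\partial N^2} - \frac{\partial^2 \log (Nq^k)!}{\partial N^2}.$$

If the approximation for F given in (12) is used, 

$$V(\hat{N}) = \frac{N(1 - q^k)\, q^k}{(1 - q^k)^2 - (pk)^2 q^{k-1}} \qquad (16)$$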
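Equations (15) and (16) are easy to evaluate directly. The sketch below is an illustration under stated assumptions rather than the paper's computation: the second derivatives of the log factorials are taken to be trigamma values, $\psi'(N + 1)$ and $\psi'(Nq^k + 1)$, and the trigamma function is computed from its recurrence and standard asymptotic series instead of from printed tables. All function names are hypothetical.

```python
import math


def trigamma(x):
    """psi'(x) for x > 0: recurrence psi'(x) = psi'(x + 1) + 1/x**2,
    followed by the asymptotic series once x is at least 6."""
    total = 0.0
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    series = inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))
    return total + series


def var_n_exact(n, p, k):
    """Equation (15), with F evaluated from trigamma values (assumed reading)."""
    q = 1.0 - p
    qk = q**k
    f = trigamma(n + 1.0) - trigamma(n * qk + 1.0)
    return n * (1.0 - qk) / (n * (-f) * (1.0 - qk) - (k * p) ** 2 / q)


def var_n_large_sample(n, p, k):
    """Equation (16), the large sample approximation."""
    q = 1.0 - p
    qk = q**k
    return n * (1.0 - qk) * qk / ((1.0 - qk) ** 2 - (p * k) ** 2 * q ** (k - 1))


if __name__ == "__main__":
    for n in (49, 98, 196, 392):
        print(n, math.sqrt(var_n_exact(n, 0.4, 3)),
              math.sqrt(var_n_large_sample(n, 0.4, 3)))
```

With p = .4 and k = 3, the first function returns standard deviations close to the theoretical values 11.56, 14.5, 19.5 and 26.9 quoted for N = 49, 98, 196 and 392 in the sampling experiments, which supports this reading of F.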


Figure 3 gives graphs of the standard deviations of estimates of N, 
as calculated from equation (15), for populations of size 50, 100, 300 
and 500 subjected to specified probabilities of capture for 3, 5, and 7 
trappings. Values of F were taken from tables of the trigamma function 
(Davis, 1935). For a given N and k (number of trappings), it may be 
seen that, in general, as p increases the calculated standard deviation 
of the estimate of N decreases. For N = 50 and N = 100, the 

[Figure 3: Asymptotic standard deviations of maximum likelihood estimates of N = 50, 100, 300, 500 (3, 5, 7 trappings). Vertical axis: standard deviation of estimates of N; horizontal axis: probability of capture during a single trapping.] 

where 
k = number of trappings, 
T = total catch, 

$$\hat{F} = \frac{\partial^2 \log \hat{N}!}{\partial \hat{N}^2} - \frac{\partial^2 \log (\hat{N}\hat{q}^k)!}{\partial \hat{N}^2},$$

and $\hat{p}$ and $\hat{q}$ are the maximum likelihood estimates of p and q. The 
variance of an estimate of p may be determined by 


Sy ol sae A162 ee 
") = (ryan — py (18) 
Example: 

Referring to the previous example, T = 320, k = 3, $\hat{N}$ = 400. 

From Figure 1A, R of .65 corresponds to $\hat{p}$ = .42; hence $\hat{q}$ = .58. 

$\hat{F}$ may be obtained from tables of the trigamma function (Davis, 
1935) by evaluating 

$$\hat{F} = \frac{\partial^2 \log \hat{N}!}{\partial \hat{N}^2} - \frac{\partial^2 \log (\hat{N}\hat{q}^k)!}{\partial \hat{N}^2}$$

or, using the large sample approximation of equation (12) 


Applying either of the above procedures we find 
$\hat{F} = -.010$. 


Hence, applying (17), 
Regression Method of Estimating Population Size from Removal Data 


Hayne (1949) described a regression method of estimating population 
size which is based on the same assumptions as the maximum likelihood 
method. 

If p is the probability of capture during a single period of trapping, 
the number expected to be captured in the first period is Np; the number 
expected to be captured in the second period is $(N - y_1)p$, where $y_1$ is the 
number caught in the first period. Similarly, the number expected to be 
caught in the ith period is 

$$E(y_i) = (N - x_i)\,p,$$

where N is the original population size and $x_i$ represents the number of 
animals caught prior to the ith period. This equation may be written 
in the form 

$$E(y_i) = Np - p\,x_i. \qquad (19)$$

This is the equation of the regression of the number caught in the ith 
period on the number caught prior to the ith period. The absolute value 
of the slope of the line estimates the probability of capture and the x 
intercept estimates the population size. 

There are several ways in which the regression of $y_i$ on $x_i$ may be 
fitted. The simplest way, suggested by Hayne, is to fit a line by eye to 
the plotted data. This method is demonstrated in Figure 4, using the 
data of the previous example (successive catches of 165, 101, 54). Other 
ways are to calculate an unweighted least squares regression or a weighted 
least squares regression. Since the binomial variance of the catch in the 
ith period is $(N - x_i)pq$, a weighted regression might assign weights to 
this catch inversely proportional to $(N - x_i)$, using the intercept of a line 
drawn by eye as a first approximation to N. 
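The unweighted least squares variant of the regression method can be sketched as follows. This is an illustrative implementation, assuming ordinary least squares of each catch on the previous total catch; the function name `regression_estimate` is hypothetical. The slope (with sign reversed) estimates p and the x intercept estimates N.

```python
def regression_estimate(catches):
    """Unweighted least squares fit of y_i on x_i (previous total catch).

    Returns (p_hat, n_hat): minus the slope, and the x intercept.
    """
    xs = []
    prior = 0
    for y in catches:
        xs.append(prior)  # x_i = number caught before the ith trapping
        prior += y
    n_points = len(catches)
    mean_x = sum(xs) / n_points
    mean_y = sum(catches) / n_points
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, catches))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    p_hat = -slope
    n_hat = intercept / p_hat  # where the fitted line meets y = 0
    return p_hat, n_hat


if __name__ == "__main__":
    # Successive catches quoted in the text: 165, 101, 54.
    print(regression_estimate([165, 101, 54]))
```

For the catches 165, 101, 54 the fitted line gives a capture probability of about .41 and a population estimate of about 401.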

It is easy to verify that the regression and maximum likelihood 
methods give identical estimates for data collected on two trappings. 
For three or more trappings the estimates by the two methods need not 
be the same. The asymptotic variance of an estimate obtained by an 
unweighted least squares procedure, using data from three or more 
trappings, is not easily determined. Estimates from a weighted least 
squares regression may be seen to be related to estimates obtained by 
the method of minimum chi-square. Minimum chi-square estimates 
are those values of N and p which minimize the expression 

$$\sum_{i=1}^{k} \frac{\left[y_i - (N - x_i)\,p\right]^2}{(N - x_i)\,p\,q}$$


[Figure 4: Estimation of population size by regression method. Vertical axis: $y_i$ = number caught during ith trapping; horizontal axis: $x_i$ = previous total catch.] 


Here the weight of each point is inversely proportional to its binomial 
variance. Since the method of minimum chi-square is asymptotically 
the same as the method of maximum likelihood, it appears that the 
variance of estimates obtained from a weighted least squares regression is 
asymptotically the same as the variance of the maximum likelihood 
estimates. 


Sampling Experiments 

Sampling experiments were carried out in order to determine how well 
the large sample theory holds at various population levels as well as to 
compare the maximum likelihood and regression methods of estimation. 

Population estimates were made from "catches" which were drawn 
at random with the aid of a table of random numbers (Snedecor, 1946) 
and Tables of the Binomial Probability Distribution (1950). The 
method of sampling produced "catches" which varied binomially about 
the expected "catches". "Populations" of sizes 49, 98, 196, and 392 
were subjected to a probability of capture of .4 for three "trappings". 
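A sampling experiment of this kind can be imitated in a few lines. The sketch below swaps the printed random number and binomial tables for a pseudo-random generator, which is a substitution and not the procedure used in the paper; the function name `simulate_removal` and the seed are illustrative.

```python
import random


def simulate_removal(n, p, k, rng):
    """Draw k successive removal catches from a population of size n.

    Each animal still at large is captured with probability p in each
    trapping, independently of the others, and captured animals are removed.
    """
    remaining = n
    catches = []
    for _ in range(k):
        caught = sum(1 for _ in range(remaining) if rng.random() < p)
        catches.append(caught)
        remaining -= caught
    return catches


if __name__ == "__main__":
    rng = random.Random(1956)  # fixed seed so the run is repeatable
    for _ in range(5):
        print(simulate_removal(49, 0.4, 3, rng))
```

Each simulated set of catches can then be passed to whichever estimator is being studied, and samples whose third catch equals or exceeds the first can be counted separately.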

In the case of three trappings, it may be shown that the maximum 
likelihood method does not give a finite estimate of N when the third 
observed catch equals or exceeds the first catch.* This happened in 
four of the 328 samples drawn from N = 49. Estimates of p and N 
were made for the 324 remaining samples by the maximum likelihood 
and unweighted least squares regression methods. The frequency 
distributions of the estimates are shown in Figure 5 and are seen to be 
skewed to the right. Seven maximum likelihood and five regression 
estimates fell outside the range of the graph. The distributions of the 
logarithms and the reciprocals of the estimates also show considerable 
skewness. 

[Figure 5: Frequency distributions of maximum likelihood estimates of N = 49, 98, 196, 392; one panel for each value of N, with the estimate of N on the horizontal axis.] 

Table 1 gives the mean, standard deviation, and per cent of the 
estimates by each of the two methods which fell within N = 49 plus 
or minus one and two times the theoretical maximum likelihood 
standard deviation, 11.56. 

Both observed standard deviations far exceed the theoretical value of 
11.56. Sample 19, with successive catches of 15, 9, and 14, gave a 
regression estimate of 175 and a maximum likelihood estimate of 342 
and is in large part responsible for the difference between the standard 
deviations of the two sets of estimates. 

*When $y_1 = y_3$, $\hat{p} = 0$ and the estimate of population size is infinite. If $y_3 > y_1$, the ratio 
$R = \sum (i - 1)\, y_i / T$ exceeds unity, which cannot be satisfied (equation (9)) by values of $\hat{p}$ between 0 and 1. 

TABLE 1 
Comparison of Two Methods of Estimation 

Parameters                           Regression    Maximum likelihood    Normal 
                                     estimates     estimates             curve 

Number of estimates                  324           324 
Mean                                 56.68         55.59 
Observed standard deviation          31.95         PAG 
Per cent falling within 49 ± 1 
  theoretical S.D.                   76.9          78.4                  68.3 
Per cent falling within 49 ± 2 
  theoretical S.D.'s                 90.1          90.4                  95.5 

The last two lines of Table 1 show that, compared to normal theory, 
too large a percentage of the estimates by either method lies within one 
theoretical standard deviation of 49, and too large a percentage lies 
more than two standard deviations away. 

Figure 6, which gives the cumulative percentage of estimates within 
a given absolute deviation of 49, shows little difference between the two 
methods. Of the 324 samples, 172 of the maximum likelihood estimates 
were closer to the true value of 49 than the corresponding regression 
estimate, while the reverse was true for the remaining 152 samples. 

Additional sampling was done for the maximum likelihood method 
with p = .4, k = 3, and N = 98, 196, and 392. For N = 98 samples 
were obtained by combining pairs of the 328 samples drawn from 
N = 49. The four samples which had been excluded previously were 
retained in this distribution. Although no case was observed in which 
the first catch did not exceed the third catch, Sample 72, comprised in 
part of one of the rejected samples from N = 49, was the poorest with 
catches of 23, 28, 22 and a maximum likelihood estimate of N = 1108. 

Eighty-two of the samples from N = 196 were obtained by combining 
successive groups of four of the original samples from N = 49. Eighty 
additional samples from N = 196 and the entire 160 samples from N = 
392 were drawn by use of random normal deviates (Wold, 1948). 

Frequency distributions of maximum likelihood estimates of N = 
98, 196, and 392 are shown in Figure 5. Table 2 gives the parameters of 
these distributions. 


178 BIOMETRICS, JUNE 1956 


[Figure 6: Cumulative percentage of estimates within specified absolute deviation from N = 49, for regression estimates and maximum likelihood estimates. Vertical axis: percentage of estimates; horizontal axis: absolute deviation of the estimate from 49.] 


As expected, agreement between the observed and theoretical results 
improves as N increases. The means of the maximum likelihood estimates 
slightly overestimate the true N. Also the maximum likelihood 
standard deviation appears to represent an upper bound to the precision 
which can be attained, even if the assumptions underlying the trapping 
procedure are realized in practice. 

Regression estimates of N = 196 were made for the 162 samples 
and agree closely with the maximum likelihood estimates. The mean 
and standard deviation of the regression estimates are 199.4 and 23.5, 
respectively. 
TABLE 2 

Maximum Likelihood Estimates 

Parameters                                   N = 98          N = 196    N = 392 

Number of estimates                       164      163*      162        160 
Mean                                    109.4     103.3    199.9      395.4 
Observed S.D.                            81.5      22.4     24.0       29.3 
Theoretical S.D.                             14.5           19.5       26.9 
Per cent within N ± 1 theor. S.D.             73             75         70 
Per cent within N ± 2 theor. S.D.'s           92             93         94 

*Sample 72 omitted 


Summary of Sampling Results 


The sampling experiments in this section demonstrate moderately 
close agreement between maximum likelihood and unweighted least 
squares regression estimates of population size made from three trap- 
pings. With the aid of Figures 1 and 2, the maximum likelihood esti- 
mates may be obtained more rapidly than the least squares regression 
estimates. 

Comparisons were made between the standard errors of the sampling 
distributions of N, at various levels of N, and the corresponding asymp- 
totic maximum likelihood standard errors. These comparisons show 
that for three trappings and a probability of capture of about .4, reason- 
able conformance between the asymptotic and observed standard errors 
takes place for N of about 200 or more. Underestimation of the true 
standard error occurs when the asymptotic formula is applied to data 
from smaller populations. For N of less than 200 an interval of two 
asymptotic standard errors (formula given and applied in example 
earlier in this section) about $\hat{N}$ may be considered to represent 
approximately a 90% confidence interval for N rather than a 95% confidence 
interval. 


III. EFFECT OF DIFFERENT TRAPPING SCHEDULES ON THE PRECISION 
OF MAXIMUM LIKELIHOOD ESTIMATES 


The biologist is interested in the proportion of the total population 
that he must trap in order to be fairly certain that his population esti- 
mate lies within some specified per cent of the true value. Also should 
he trap intensively over a short period of time or apply less effort over 
a greater number of trapping intervals? In studying the latter question, 
let us assume that the trapping region is fixed and that the home ranges 
of the animals overlap within this trapping area. Thus the number of 
animals exposed to capture during the first trapping is the same regard- 
less of the number of traps set. 

Let $p_0$ represent the probability that an animal will be captured 
by a given trap during a single period of trapping, and let $q_0 = 1 - p_0$. 
Assuming $p_0$ to be the same for each trap and also assuming that each 
trap operates independently of all other traps, then the probability 
that an animal escapes trapping by any of t traps in a single period 
is $q_0^t = q$. Also $q^k = q_0^{tk}$ is the probability of not being caught in any 
of k periods. Consequently the proportion of the population which is 
expected to be captured throughout the trapping program is theoretically 
a function only of $q_0$ and of the total number, tk, of trap-nights. 

We may now see how the variance of the estimates of N is affected 
by altering the trapping schedule, i.e., the values of t and k, when the 
value of $q_0$ and the total expected capture $N(1 - q^k)$ are kept constant. 
Equation (15) may be written in the following form: 

$$V(\hat{N}) = \frac{1}{(-F) - \dfrac{(kp)^2/q}{N(1 - q^k)}} \qquad (20)$$

where 

$$F = \frac{\partial^2 \log N!}{\partial N^2} - \frac{\partial^2 \log (Nq^k)!}{\partial N^2}.$$

For a given total expected capture, the quantity $Nq^k$ remains constant, 
and consequently F remains constant. Thus the effect on 
$V(\hat{N})$ of different schedules will be due to the changes in the value of 
$(kp)^2/q$ as k and p vary but $(1 - q^k)$ remains fixed. 
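The effect of the schedule can be explored numerically. The sketch below is an illustration only: it fixes the per-trap escape probability and the number of trap-nights, varies the split between traps per period and number of periods, and evaluates the standard deviation from the large sample form (16) rather than from (15). The function names and the example values (N = 500, per-trap escape probability .9, 12 trap-nights) are assumptions chosen for the demonstration.

```python
import math


def sd_large_sample(n, p, k):
    """Standard deviation of the estimate of N from equation (16)."""
    q = 1.0 - p
    qk = q**k
    var = n * (1.0 - qk) * qk / ((1.0 - qk) ** 2 - (p * k) ** 2 * q ** (k - 1))
    return math.sqrt(var)


def schedule_summary(n, q0, traps, periods):
    """Return (expected proportion captured, standard deviation of N estimate)
    for `traps` traps set in each of `periods` trapping periods."""
    q = q0**traps                 # escape probability for one period
    p = 1.0 - q                   # capture probability for one period
    proportion = 1.0 - q**periods
    return proportion, sd_large_sample(n, p, periods)


if __name__ == "__main__":
    # Twelve trap-nights split three different ways.
    for traps, periods in ((6, 2), (4, 3), (1, 12)):
        print(traps, periods, schedule_summary(500, 0.9, traps, periods))
```

The expected proportion captured is the same for all three schedules, while the standard deviation falls modestly as the number of periods rises.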

Table 3 shows the standard deviation of $\hat{N}$ when N equals 200, 300, 
500 and 1000, and when given proportions of the population are expected 
to be captured over varying numbers of trappings. The standard 
deviation decreases only slightly as the number of trapping intervals 
increases beyond three. For practical purposes the standard deviation 
can be considered essentially constant for a given N and a given total 
proportion captured. Therefore, under the assumption that infiltration 
of animals from nearby areas is of negligible importance, the trapper 
would choose that schedule which would minimize his cost. In some 
circumstances he might find it more economical to set many traps over 
a short period; in others, a few traps for a considerable length of time. 
If infiltration were expected to disrupt the assumptions of the method 
of estimation, a short intensive program would be indicated. 
Discussion of Precision 

With the preceding results in mind, i.e., that the precision of the 
estimates depends primarily upon the expected proportion of the total 
population captured, we may now study the removal method to see 
approximately what proportion should be captured over an entire 
trapping program in order to reduce the standard deviation to a specified 
per cent of the population size. Table 4 gives these values for population 
sizes from 200 to 100,000 and for coefficients of variation of 30, 20, 10, 
and 5 per cent. This table shows that relatively large proportions of 
the population must be trapped in order to obtain precise estimates. 
It must also be remembered that the variances used in these calculations 
are at best valid only when the assumptions of a constant binomial 
probability of capture and a stationary population are satisfied. Since 
these assumptions are unlikely to be completely realized under actual 
trapping conditions, we should consider the variances obtained above 
as a minimum that might be reached under ideal conditions. The poor 
precision of the removal method raises a serious question as to its 
utility to the biologist. 


TABLE 4 

Proportion of Total Population Required to be Trapped for Specified Coefficient 
of Variation of $\hat{N}$ 

Proportion (to nearest .05) of population to be captured 
(in 100 or fewer trappings) 

                   Coefficient of Variation 
N            30%       20%       10%       5% 

200          noo       .60       sie       .90 
300          .50       .60       49        .85 
500          .45       TOO!      .70       .80 
1,000        .40       .45       .60       Shes 
10,000       .20       .25       BE        .50 
100,000      .10       aS        .20       .30 

Changes in population density that are large may be detected by 
the removal method. Let us assume that at times $T_1$ and $T_2$ the population 
is of sizes N and rN respectively, i.e. r is the factor by which the 
population size has been changed during the interval between $T_1$ and $T_2$. 
Assume that at time $T_1$, the intensity of trapping (i.e. the number of 
traps set) is such that the coefficient of variation of the estimate of 
population size is some value c. If the same number of traps are set at 
time $T_2$, and are sufficiently numerous so that the possible increase in 
the population size would not affect the assumption of independent 
action of each trap, then theoretically the probability of capture during 
a trapping at time $T_2$ is the same as at time $T_1$. From equation (16), it 
is seen that asymptotically for given values of p and k, the variance of an 
estimate of population size is proportional to the population size; hence 
the coefficient of variation of the population estimate at time $T_2$ is $c/\sqrt{r}$. 
Values of r, the amount of change in population size, which have a 
specified probability of being detected on the basis of trapping results 
at times $T_1$ and $T_2$, can therefore be determined. The following table 
gives the size of population change that has an eighty per cent chance 
of being detected (at the 5% level of significance) from trapping results, 
in which the trapping effort in the two trials is constant and is of sufficient 
intensity so that the coefficient of variation of the first estimate is a 
specified per cent of the population size at that time: 


Coefficient of Variation    Upward Change    Downward Change 
(Per Cent)                  (Per Cent)       (Per Cent) 

10                          44               36 
20                          96               65 
30                          159              89 
40                          233              - 
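The detectable changes can be approximated with a short calculation. The sketch below rests on assumptions that the text does not spell out: the two estimates are treated as independent and normally distributed, with standard deviations cN at the first time and $c\sqrt{r}\,N$ at the second (which follows from a coefficient of variation of $c/\sqrt{r}$ for a population of size rN), and the test is two-sided at the 5% level. The function name `detectable_change` is hypothetical.

```python
import math
from statistics import NormalDist


def detectable_change(cv, alpha=0.05, power=0.80):
    """Return (upward %, downward %) changes detectable with the given power.

    Solves |r - 1| = z * cv * sqrt(1 + r), where z is the sum of the normal
    deviates for the two-sided significance level and for the power.
    Downward is None when no decrease can reach the required power.
    """
    normal = NormalDist()
    z = normal.inv_cdf(1.0 - alpha / 2.0) + normal.inv_cdf(power)
    a = z * cv
    root = math.sqrt(a * a + 8.0)
    # With s = sqrt(1 + r): s**2 - a*s - 2 = 0 (increase), s**2 + a*s - 2 = 0 (decrease).
    r_up = ((a + root) / 2.0) ** 2 - 1.0
    r_down = ((root - a) / 2.0) ** 2 - 1.0
    upward = 100.0 * (r_up - 1.0)
    downward = 100.0 * (1.0 - r_down) if r_down > 0.0 else None
    return upward, downward


if __name__ == "__main__":
    for cv in (0.10, 0.20, 0.30, 0.40):
        print(cv, detectable_change(cv))
```

Under these assumptions a coefficient of variation of 10, 20 and 30 per cent gives upward changes of about 44, 96 and 159 per cent and downward changes of about 36, 65 and 89 per cent, and at 40 per cent no downward change reaches the required power.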


IV. VALIDITY OF ASSUMPTIONS 


In addition to knowing the theoretical properties of the removal 
method, the degree to which the assumptions underlying the method 
are met in practice will help determine its usefulness as a tool for obtaining 
estimates of population size. 

If fairly intensive trapping is employed over a short period of time, 
the assumption of a stationary population may be considered reasonable 
in many situations. The removal method also assumes that the prob- 
ability of capture during a trapping is the same for all animals, and 
that this probability does not change from trapping to trapping. Evi- 
dence of trap-proneness and shyness is reported in the literature (Chitty 
and Kempson, 1952; Tanaka, 1951; Young et al, 1952); hence it would be 
valuable to have a quantitative measure of the amount of variation in 
catchability within an animal population. It should be pointed out 
that any population estimate based on trapping results applies only 
to the catchable population and therefore in situations where for 
example the young remain relatively inactive and not subject to capture, 
the adult population alone will be estimated. 

In addition to individual differences during a given trapping it 
would be important to know the changes in catchability that occur from 
trapping to trapping. Weather changes for example will influence 
amount of activity, which, in turn, will affect the probability of capture 
(Brown, 1954; Calhoun, 1945). Also, how much of a role does learning 
play as a factor in the probability of capture? Ecologists may help 
answer questions such as these in the future. 

The removal method has been used in the North American Census 
of Small Mammals (Calhoun, 1948-1951). Using the data from these 
censuses, two different chi-square tests were employed in order to study 
the assumption that the probability of capture remains constant 
throughout the trapping program. 


Chi-Square Tests 


In Section II, it is shown that a multinomial approach may be used 
as an alternative to the conditional binomial approach in obtaining 
maximum likelihood estimates of N, the population size, and p, the 
binomial probability of capture assumed constant throughout the trapping 
program. The multinomial approach assumes that the observed 
catches follow a multinomial distribution with the probability of being 
captured during the ith trapping equal to $p q^{i-1}$. Consequently the 
probability that an animal will not be captured during the entire program 
of k trappings is $q^k$. If this hypothesis is satisfied and N and p 
are known then, asymptotically, the quantity 

$$\chi^2 = \sum_{i=1}^{k} \frac{(y_i - N p q^{i-1})^2}{N p q^{i-1}} + \frac{(N - T - N q^k)^2}{N q^k} \qquad (21)$$

will be distributed as $\chi^2$ with k degrees of freedom. In expression (21) 
$y_i$ represents the ith observed catch and T equals the total 
of the individual catches. Hence to test the hypothesis that the probability 
of capture has a constant value p, $\chi^2$ may be calculated for a 
given set of data and compared with the critical tabular value. The 
asymptotic properties of this multinomial or unconditional chi-square 
test, including its limiting power, are known (see e.g. Cochran (1952)). 

If either or both of the parameters N and p are estimated from a 
sample, then expression (21) with the estimates substituted is also 
distributed as chi-square with decrease in degrees of freedom correspond- 
ing to the number of parameters estimated. When both N and p are 
estimated and $T/(1 - \hat{q}^k)$ is substituted for N (equation 8), then the 
form of the chi-square test becomes 

$$\chi^2 = \sum_{i=1}^{k} \frac{\left[y_i - T \hat{p} \hat{q}^{i-1}/(1 - \hat{q}^k)\right]^2}{T \hat{p} \hat{q}^{i-1}/(1 - \hat{q}^k)}$$

with (k - 2) degrees of freedom. This expression, which compares the 
ith observed catch with the proportion of the total catch expected during 
the ith trapping, is independent of N and therefore is a test of the 
assumption that the proportion of free animals expected to be captured 
during each trapping is constant. 

A conditional test of a constant binomial probability of capture 
suggests itself if consideration is given to $x_i$, the total number captured 
prior to the ith trapping. By analogy with a chi-square test for the 
sum of a number of independent binomial variables, the conditional 
binomial test is of the form 

$$\chi^2 = \sum_{i=1}^{k} \frac{\left[y_i - (N - x_i)\,p\right]^2}{(N - x_i)\,p\,q} \qquad (22)$$

The properties of this test, such as its power function, have not been 
thoroughly studied. However it may be shown that when the assumption 
of a constant binomial probability of capture p is satisfied, expressions 
(21) and (22) yield asymptotically identical results for a given set of 
data. To do this let us define the ith catch as 

$$y_i = (N - x_i)\,p + e_i \sqrt{(N - x_i)\,p\,q} \qquad (23)$$

where $e_i$ is a normal random variable with mean 0 and variance 1, and $x_i$ 
represents the total capture before the ith trapping. 

Substituting the right-hand side of (23) for $y_i$ in expression (21) we 
find, after considerable algebra, that 

$$\chi^2 = \sum_{i=1}^{k} e_i^2 + \text{terms of order } N^{-1/2}.$$

The same procedure applied to expression (22) yields 

$$\chi^2 = \sum_{i=1}^{k} e_i^2.$$

Thus, asymptotically, for a given set of data the multinomial and 
the conditional binomial tests for the constancy of p provide numerically 
identical results. Since the multinomial test is known to follow a chi-square 
distribution, this result shows that the conditional binomial test 
also follows a chi-square distribution with the same number of degrees 
of freedom. It may further be assumed that the conditional binomial 
test using estimates of N and p substituted for the parameters will also 
be distributed as chi-square with (k - 2) degrees of freedom. 
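The two statistics can be compared on a single data set. The sketch below is illustrative: it takes an estimate of p as given, replaces N by T/(1 - q^k) as in equation (8), and evaluates both the unconditional statistic (in the form that compares each catch with its expected share of the total catch) and the conditional binomial statistic of expression (22). The function name `chi_square_tests` is hypothetical.

```python
def chi_square_tests(catches, p):
    """Return (unconditional, conditional) chi-square statistics.

    N is replaced by T / (1 - q**k), so both statistics are referred to
    chi-square with k - 2 degrees of freedom when p is itself estimated.
    """
    k = len(catches)
    q = 1.0 - p
    total = sum(catches)                 # T, the total catch
    n_hat = total / (1.0 - q**k)         # equation (8)

    # Unconditional (multinomial) form with N estimated from T.
    unconditional = 0.0
    for i, y in enumerate(catches):      # i runs over 0..k-1, i.e. (i - 1) in the text
        expected = total * p * q**i / (1.0 - q**k)
        unconditional += (y - expected) ** 2 / expected

    # Conditional binomial form, expression (22).
    conditional = 0.0
    prior = 0                            # x_i, the catch before the ith trapping
    for y in catches:
        exposed = n_hat - prior
        conditional += (y - exposed * p) ** 2 / (exposed * p * q)
        prior += y
    return unconditional, conditional


if __name__ == "__main__":
    # Catches and estimate of p from the worked example in the text.
    print(chi_square_tests([165, 101, 54], 0.42))
```

For the catches 165, 101, 54 with p taken as .42, the two statistics agree to about the third decimal place, which illustrates the asymptotic equivalence described above.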

It should be pointed out that the conditional binomial test is not 
expected to detect changes in N, since it may be shown that a linear 
regression of catch, $y_i$, on previous catch, $x_i$, is expected when the 
exposed population changes by a constant factor between trappings 
and p remains constant. 

It would be of considerable interest to determine the relative powers 
of the conditional and unconditional tests of what is essentially the same 
hypothesis. This problem was not taken up actively, however, since it 
represents a digression from the main purpose of this paper. 


Application of Chi-Square Tests to Trapping Data 


In the North American Census of Small Mammals effort was made 
to sample rodent populations of the genera Microtus, Sigmodon, 
Peromyscus, Synaptomys, and Dicrostonyx. Trapping data were 
collected in many locations throughout the United States and Canada. 

Conditional and unconditional chi-square tests of the assumption 
of a constant probability of capture were performed on 149 sets of these 
trapping data. Each set of data represented the catches in a given 
area on three successive nights of trapping, a constant number of traps 
being set each night. The data presented here include only those cases 
where the total of three catches in each set equals or exceeds ten animals, 
and also where the first night’s catch exceeds the final catch in order 
that maximum likelihood estimates of N and p might be made. Fifty- 
five other sets, in which the second condition was not met, were excluded 
from the calculations. The catches were not separated according to 
genera since this would have resulted in groups of catches too small to 
analyze. Therefore the hypothesis tested is in effect that the probability 
of capture is the same for all the genera and remains constant over the 
entire trapping program. The results of the chi-square calculations are 
given below: 


Result                                    Conditional      Unconditional 
                                          chi-squares      chi-squares 

Significant at 5% level                   9.4%             7.4% 
Total of 149 chi-squares                  192.20           187.57 
P of total chi-square with 149 d.f.       .009             .016 

Thus both total chi-squares are significant at the 2% level. 

A trend in the direction of the deviations of observed from expected 
catches was studied as a possible contributor to the significant total 
chi-squares. A comparison between the observed third catch and the 
expected third catch, as predicted from the first two observations, 
provides the following results: 


                    $y_3 - E(y_3)$ 

                    Positive    Zero    Negative    Total 

No. of samples      73          5       71          149 


Although these 149 samples are selected ($y_1$ required to exceed $y_3$), 
consistently positive differences would be expected if major infiltration 
during the trapping programs were a general phenomenon. Conversely, 
if the probability of capture were decreasing throughout the trapping 
program, consistently negative differences might be observed. The 
above results do not suggest any such trend in the direction of the 
differences. Consequently, the significant chi-squares appear to repre- 
sent unsystematic deviations that are on the whole larger than simple 
binomial variability would predict. This may be related to differences 
among the species comprising the catches or this added variability 
might be due to any of a number of additional factors (climate, food 
supply, etc.) which the theory assumes to remain constant throughout 
the trapping program, but which may change in practice. Some varia- 
tions in trapping effort regularly occur due to variations in the number of 
sprung but unfilled traps observed throughout a trapping program. 

The formulas for the standard deviation of $\hat{N}$ given earlier in this 
paper assume that the size of each catch is subject only to binomial 
variability. The data studied above suggest that additional variability 
is present, which will be underestimated by the theoretical standard 
deviations. As an approximation to the true standard deviation, one 
might adjust the theoretical value by the factor 

$$\sqrt{\frac{\text{total chi-square}}{\text{degrees of freedom}}}$$

which is about 1.2 for the trapping data studied. This kind of rough 
adjustment for variability in excess of purely binomial variability has 
been employed in the field of bioassay (Bliss, 1952). 


V. SUMMARY 


The theory of the maximum likelihood method of estimation of 
population size has been reviewed and developed for application to data 
collected in a trapping program employing constant effort and removal of 
captured animals from the population. The assumptions underlying this 
method are that the population is stationary, that the probability of 
capture during a given trapping is the same for all animals exposed to 
capture, and that this probability of capture does not change from 
trapping to trapping. 

A rapid graphical method for obtaining maximum likelihood esti- 
mates of N (population size) and p (probability of capture during a single 
trapping) has been presented as well as formulas for the standard 
errors of these estimates. 

The asymptotic precision of the removal method of estimation has 
been determined for specified levels of probability of capture and for 
given numbers of trappings. A study was made of the effect on the 
asymptotic variance of maintaining constant the total number of units 
of trapping effort, but varying the trapping schedule. It was seen 
(Table 3) that only a slight gain in precision is expected from increasing 
the number of trappings with fewer traps set per trapping. The propor- 
tion of the population that must be captured for a specified coefficient of 
variation has been calculated. The precision of the method was shown 
to be poor, e.g. for N of 1000 or less, at least 40% of the total population 
must be trapped in order to obtain a coefficient of variation of 30% or less. 

Experimental sampling was carried out in order to compare the maxi- 
mum likelihood and regression methods of estimation and to compare the 
distributions of observed estimates of N with the asymptotic distribu- 
tions. The distributions of estimates obtained by the two methods were 
very similar. For small N’s the observed distributions were skewed in 
the direction of overestimating N, and the estimated variances exceeded 
the asymptotic values. With a probability of capture of .4 during each 
of three “trappings”, fairly close approximation to normality and the 
asymptotic variance was attained by the distributions of estimates of 
N = 200 and larger. 

Chi-square tests of the assumption of a constant binomial probability 
of capture were performed on trapping data collected in the North 
American Census of Small Mammals. Both conditional and uncon- 
ditional chi-squares were significant at the two per cent level. A study 
of the direction of the differences between observed and expected catches 
revealed no trend, suggesting that the significant chi-squares might have 
resulted from larger unsystematic differences between observed and 
expected than would be explained by simple binomial variability. 
Differences in catchability among the animals and/or other factors, assumed 
constant by the theory, might explain the additional variability observed. 

THE CONCEPT OF PATH COEFFICIENT AND ITS
IMPACT ON POPULATION GENETICS*


C. C. Li
Graduate School of Public Health, University of Pittsburgh


INTRODUCTION 


The method of path coefficients was first published by Professor 
Sewall Wright thirty-five years ago. In 1921 there appeared in the 
Journal of Agricultural Research (Vol. 20) a general account of the method 
and the relationship between correlation and path coefficients, together 
with some examples of application; and in Genetics (Vol. 6) a series of 
five papers dealing exclusively with the application of path coefficients 
to genetic problems. Previously known results of various mating 
systems, obtained by laborious arithmetical procedures, were confirmed 
by the more elegant method, and many new results were reached, some 
of which were later corroborated by the method of matrix algebra while 
others are still difficult to obtain by any other method today. These 
classical papers, together with the pioneer work of Fisher (1918), still 
constitute the basic readings for students of population genetics, although 
the method has since taken a more sophisticated form and the field of 
application has been widened. However, one must admit that the 
method of path coefficients, as powerful and flexible as it is, was not 
immediately very popular among geneticists, still less so among pro- 
fessional statisticians. It was much later that its usefulness became 
gradually and generally appreciated. 

Path coefficients can be treated at various mathematical levels. 
The most important properties, however, can be deduced and studied 
by standard statistical tools. To understand the method requires little 
more than a knowledge of multiple and partial regression and correlation. 

*Presented at the A.I.B.S. symposium on Sewall Wright’s contributions to Population Genetics,
sponsored by the Genetics Society of America, the American Society of Human Genetics, and the
Biometric Society (ENAR), and held in East Lansing, Michigan, September 6, 1955.

It is a special type of multivariate analysis, a method of dealing with a
“closed” system of variables that are linearly related. (For non-linearly
related variables, an appropriate transformation of scales may
be necessary for its applicability.) By a closed linear system is meant
that each variable in the system is either a linear combination of some 
other variables in the system or is one of the basic factors, which may 
be correlated with or independent of other basic factors in the system. 
In other words, the system is formally complete, including all the basic 
factors (the “causes”) and their resultant variables (the “effects”).

On account of the nature of a closed system, the practical employ- 
ment of the method of path coefficients is greatly facilitated by the 
formulation of a diagram showing the interrelationships of the variables 
concerned. In constructing such a diagram the convention is to use a 
double-headed arrow to indicate correlations between basic factors and 
a uni-directional arrow to signify the direct path of influence from one 
variable to another. The diagram, or causal network, must be consistent 
with a particular viewpoint and contain no duplicate or superfluous 
paths of connections. It is important to realize that various different 
viewpoints may be taken in regard to a system of interrelated variables, 
and, consequently, the causal diagram may be constructed in various 
ways. At least this is so for a purely mathematical setup. The choice 
of a particular viewpoint is up to the investigator, but, once it is taken, 
it must be held consistently in the entire diagram. 

At this point it is well to clarify a possible misapprehension of the path 
method. It is not an endeavor to infer causal relations from observed 
correlations among a set of interrelated variables. Quite on the
contrary, the employment of this method must be preceded by the
formulation of a causal scheme, either based upon an a priori knowledge of
the causal relations or based upon an hypothesis which the investigator 
chooses to accept or test. Consequently, the more we know of the true 
relations among the variables, the more meaningful will be the results 
of path analysis. In the case of incomplete external knowledge con- 
cerning the causal relations, the interpretations derived from the analysis 
are of course subject to revision in the light of further knowledge and 
revised hypotheses. 


PATH COEFFICIENTS 


A comprehensive mathematical presentation of the method of path 
coefficients was given by Professor Wright himself in 1934. His two 
more recent summaries of the subject are to be found in Annals of 
Eugenics (1951, Appendix) and Statistics and Mathematics in Biology 
(1954, Chap. 2). The only “pure” statistician (known to the writer as 
of now) who made a thorough discussion of the biometric concept and 
limitations of the method, as well as some topics related to path
coefficients, is Prof. John W. Tukey (1954, Chap. 3), whose work should
be consulted by both statisticians and geneticists. A much more 
elementary exposition of the subject has been outlined by the present 
writer (1955, Chap. 12). It is not proposed to reproduce all the mathe- 
matical details here. A very sketchy account must suffice for the present. 
Some of the more specific points will be discussed along with concrete 
examples in later sections. 

The first premise is that the degree or extent of influence of one 
variable upon another can be expressed in quantitative terms. Then 
it is a matter of devising a numerical measurement of such an influence. 
After the construction of an adequate causal diagram referred to in the 
previous section, the task is to find a device for assigning a value to 
each of the arrows serving as a symbol of influence along that path 
(which is directional). The value so assigned to the path is called a 
“path coefficient”.

Suppose that a dependent variable Y is linearly determined by two 
factors, X and Z. This is our causal scheme, the simplest of its kind. 
In words, the path coefficient for the arrow from X to Y is defined as 
the portion of the standard deviation of Y that is due to the variation in 
X. The meaning of this verbal statement may be made clear by the 
following consideration. Let the ordinary multiple regression equation 
of Y on X and Z (fitted by the method of least squares) be Y = A +
BX + CZ, or, more conveniently,


$$Y - \bar{Y} = B(X - \bar{X}) + C(Z - \bar{Z}), \tag{1}$$


where B is the regression coefficient of Y on X. It is the number of 
units expected to change in Y for each unit change in X, such as the 
change in number of bushels (Y) of grain per acre for each inch of 
rainfall (X) in the growing season, or the number of pounds expected to 
change in weight Y for each inch of height X. Clearly, the value of B 
not only depends upon the degree of influence of X on Y but also depends 
upon the actual physical units employed in measuring the variables. In 
other words, the value of B will change if the Y is expressed in units of 
tons instead of pounds and/or X is expressed in units of centimeters 
instead of inches, although the influence of X on Y remains the same. 
It is therefore desirable to devise a regression equation expressing the 
same relationship but independent of physical units. This may be 
accomplished by using the so-called “standardized” variables. Thus, 
equation (1) may be rewritten 


$$\frac{Y - \bar{Y}}{\sigma_Y} = \frac{B\sigma_X}{\sigma_Y}\cdot\frac{X - \bar{X}}{\sigma_X} + \frac{C\sigma_Z}{\sigma_Y}\cdot\frac{Z - \bar{Z}}{\sigma_Z}. \tag{2}$$

To be sure, the equations (1) and (2) are saying the same thing, except
that (1) is in terms of actual units and (2) in units of standard deviations. 

It is convenient here to use lower case letters to denote the
corresponding standardized variables. Writing $x$ for $(X - \bar{X})/\sigma_X$, etc., the
foregoing equation takes the form

$$y = bx + cz, \tag{2'}$$

in which the new regression coefficients for the standardized variables are

$$b = p_{YX} = \frac{B\sigma_X}{\sigma_Y}, \qquad c = p_{YZ} = \frac{C\sigma_Z}{\sigma_Y}. \tag{3}$$

These regression coefficients are also without physical units. The value of
$b = p_{YX}$ is named the path coefficient for the direct path from X to Y.
Since a path coefficient is merely a regression coefficient in standardized
form, it may also be referred to as a “standardized regression coefficient”.
Note that the definition (3) gives a precise meaning of our previous verbal 
definition: the portion of the standard deviation of Y that is due to the 
variation in X. 
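
The unit-free character of definition (3) can be checked numerically. The following is only a minimal sketch: the six observations (height in inches, a second factor Z, weight in pounds) are hypothetical values invented for illustration, and the helper names `fit_two_predictors` and `path_coefficients` are not from the paper. It fits equation (1) by least squares, then shows that changing the units of X and Y rescales B but leaves the path coefficients unchanged.

```python
from statistics import fmean, pstdev

def fit_two_predictors(y, x, z):
    """Least-squares fit of Y = A + B*X + C*Z; returns (A, B, C)."""
    my, mx, mz = fmean(y), fmean(x), fmean(z)
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    sxy = sum((a - mx) * (v - my) for a, v in zip(x, y))
    szy = sum((c - mz) * (v - my) for c, v in zip(z, y))
    det = sxx * szz - sxz ** 2          # nonzero unless X and Z are collinear
    B = (sxy * szz - szy * sxz) / det
    C = (szy * sxx - sxy * sxz) / det
    A = my - B * mx - C * mz
    return A, B, C

def path_coefficients(y, x, z):
    """Standardized coefficients of definition (3): B*sd(X)/sd(Y), C*sd(Z)/sd(Y)."""
    _, B, C = fit_two_predictors(y, x, z)
    sy = pstdev(y)
    return B * pstdev(x) / sy, C * pstdev(z) / sy

# Hypothetical data, for illustration only.
x_in = [60, 62, 65, 67, 70, 72]          # height, inches
z = [30, 25, 40, 35, 50, 45]             # a second factor, arbitrary units
y_lb = [110, 118, 135, 140, 160, 163]    # weight, pounds

# Same observations in other units: centimeters and tons.
x_cm = [2.54 * v for v in x_in]
y_tons = [v / 2000 for v in y_lb]

_, B_in_lb, _ = fit_two_predictors(y_lb, x_in, z)
_, B_cm_tons, _ = fit_two_predictors(y_tons, x_cm, z)
p_original = path_coefficients(y_lb, x_in, z)
p_rescaled = path_coefficients(y_tons, x_cm, z)

print("B changes with units:", B_in_lb, B_cm_tons)
print("path coefficients do not:", p_original, p_rescaled)
```

The population standard deviation (`pstdev`) is used for all three variables; since only ratios of standard deviations enter (3), the choice of divisor does not affect the result.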

The path coefficient, so defined, possesses many properties which 
make it useful in statistical analysis. Being a type of regression co- 
efficient, it is directional (e.g. from X to Y), may be positive or negative, 
and may be greater or less than unity. Being without a physical unit, 
it resembles a correlation coefficient. Indeed, it reduces to an ordinary 
correlation coefficient under certain simple conditions. Some of these 
properties are illustrated in the following examples. 


EXAMPLES OF ANALYSIS 


Perhaps there is no single numerical example which would give all 
the properties of path coefficients. The basic purpose of this section is 
to show the similarity and difference between path analysis and the 
ordinary multiple regression method, and to point out the advantages of 
the former over the latter in interpreting and measuring the causal 
relationships. To begin with, let us consider the two sets of data pre- 
sented in Tables 1 and 2, and assume that Y is a linear combination of 
X and Z in both cases. Fitting a regression equation of type (1) by the 
usual procedure of least squares amounts to the solution for A, B, and 
C of the following set of ‘normal’ equations: 


$$\begin{aligned}
\sum Y &= AN + B\sum X + C\sum Z,\\
\sum YX &= A\sum X + B\sum X^2 + C\sum XZ,\\
\sum YZ &= A\sum Z + B\sum XZ + C\sum Z^2.
\end{aligned} \tag{4}$$


BIOMETRICS, JUNE 1956 


TABLE 2
Y as a Linear Combination of X and Z

          Y       X       Z
        2.26     10       3
        2.60      9       5
        6.22     20       6
        5.96     17       8
        5.12     11      11
        8.18     18      14
        8.50     18      15
        9.16     17      18
  Sum  48.00    120      80

r_YX = .8220,  r_YZ = .9042,  r_XZ = .5000

where N = 8 = number of observations in each variable. The equations
(4) may be found in almost every textbook of statistics and are
reproduced here for reference and later use. Thus, for the data of Table 1
we find that Y = −1.70 + .30X + .32Z, or

$$Y - \bar{Y} = .30(X - \bar{X}) + .32(Z - \bar{Z}). \tag{5}$$

Proceeding the same way with the data of Table 2, we find that A =
−1.7, B = .30, C = .32; that is, the regression equations for the two
sets of data are exactly the same. Based upon the regression equation
(5) alone, we might conclude that the influences of X and Z on Y are
the same for the two cases. However, expressing the equation in the
standardized form (2'), we have for the data of Tables 1 and 2
respectively:

$$y = .60x + .80z, \tag{5.1}$$

$$y = .4932x + .6576z. \tag{5.2}$$


Now we proceed to examine why (5.1) and (5.2) are different and what
the standardized regression coefficients mean. The chief difference
between the two sets of data, as may have been noticed, is that X and Z
are uncorrelated in the first example and correlated in the second.
The results of analysis up to this point may be best summarized in the 
form of diagrams. Thus, Figures 1 and 2 represent the causal relations 




FIGURE 1 FIGURE 2 
Uncorrelated causes Correlated causes 


embodied in Tables 1 and 2 respectively. The coefficients in (5.1) and 
(5.2) are the path coefficients of the respective arrows in the diagrams. 
The path coefficient is taken as a measure of the degree of influence 
along the path. We note that when the causal factors are uncorrelated 
(Fig. 1), the path coefficient is simply the ordinary correlation coefficient 
between the two variables concerned (as given at the bottom of Table 1). 

Thus,

$$r_{YX} = p_{YX} = .60, \qquad r_{YZ} = p_{YZ} = .80. \tag{6.1}$$

When the causal factors are correlated (Fig. 2), the following relations
hold:

$$\begin{aligned}
r_{YX} &= p_{YX} + r_{XZ}\,p_{YZ} = .4932 + (.5)(.6576) = .8220,\\
r_{YZ} &= p_{YZ} + r_{XZ}\,p_{YX} = .6576 + (.5)(.4932) = .9042.
\end{aligned} \tag{6.2}$$


The meaning of the relations (6.2) becomes immediately clear upon 
examining the corresponding diagram. For instance, the first expression 
shows that the correlation between X and Y is the sum of the values of 
the two paths connecting them. In other words, the total correlation 
$r_{YX}$ = .8220 has been separated into two components: the component
$p_{YX}$ = .4932 measures the influence of the direct path from X to Y,
and the other component $r_{XZ}\,p_{YZ}$ = (.5)(.6576) = .3288 measures the
influence of the indirect path from X to Y, via Z. The route X ↔ Z → Y
is also known as a “compound path”, whose value is the product of
those of the two individual steps. The separation of a correlation
coefficient into various components is one of the chief accomplishments of
the method of path coefficients. Analogous to “the analysis of variance”,
the path method may be called “the analysis of correlation”.
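
The decomposition in (6.2) can be reproduced directly from the eight observations of Table 2. The sketch below is illustrative only: the helper `corr` and the variable names are assumptions, not notation from the paper. It computes the three correlations, solves the two standardized normal equations for the path coefficients, and splits $r_{YX}$ into its direct and indirect components.

```python
from math import sqrt

# The eight observations of Table 2.
Y = [2.26, 2.60, 6.22, 5.96, 5.12, 8.18, 8.50, 9.16]
X = [10, 9, 20, 17, 11, 18, 18, 17]
Z = [3, 5, 6, 8, 11, 14, 15, 18]

def corr(u, v):
    """Ordinary product-moment correlation coefficient."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

r_yx, r_yz, r_xz = corr(Y, X), corr(Y, Z), corr(X, Z)

# Solve  r_yx = p_yx + r_xz * p_yz  and  r_yz = r_xz * p_yx + p_yz
# for the two path coefficients.
den = 1 - r_xz ** 2
p_yx = (r_yx - r_xz * r_yz) / den
p_yz = (r_yz - r_xz * r_yx) / den

direct = p_yx            # path X -> Y
indirect = r_xz * p_yz   # compound path X <-> Z -> Y

print(f"r_YX = {r_yx:.4f}, r_YZ = {r_yz:.4f}, r_XZ = {r_xz:.4f}")
print(f"p_YX = {p_yx:.4f}, p_YZ = {p_yz:.4f}")
print(f"direct {direct:.4f} + indirect {indirect:.4f} = {direct + indirect:.4f}")
```

Solving the standardized equations rather than the raw normal equations (4) keeps the example short; both routes give the same path coefficients.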

Since the relation (6.1) is a special case of (6.2), we need only to 
demonstrate the truth of the latter. On transforming the actual values 
Y, X, Z, into the standardized values y, x, z, we obtain the pleasingly 
simple results: $\sum y = \sum x = \sum z = 0$, $\sigma_y^2 = \sigma_x^2 = \sigma_z^2 = 1$, and

$$\operatorname{cov}(y, x) = \frac{\sum yx}{N} = r_{YX}, \quad \text{etc.}$$

From the last expression, we see that a correlation coefficient may be
called a “standardized covariance”. To fit a regression equation of y
on x and z, we need only substitute these simplifying values in normal
equations (4) in which we use b and c to stand for the path coefficients.
The first equation yields a = 0. The remaining two equations reduce to:

$$\begin{aligned}
r_{YX} &= b + c\,r_{XZ},\\
r_{YZ} &= b\,r_{XZ} + c,
\end{aligned} \tag{6}$$

which is our (6.2) with $b = p_{YX}$ and $c = p_{YZ}$. When $r_{XZ}$ = 0, we obtain
(6.1). The solution for b and c from equations (6) will of course yield
the standardized regression equations (5.1) or (5.2).

There remains one more point to be mentioned. If Y is completely 
(without “residual error’) determined by X and Z according to relation 
(1), then 


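$$\sigma_Y^2 = B^2\sigma_X^2 + C^2\sigma_Z^2 + 2BC\,\sigma_X\sigma_Z\,r_{XZ}. \tag{7}$$

Dividing by the variance of Y throughout,

$$1 = \left(\frac{B\sigma_X}{\sigma_Y}\right)^2 + \left(\frac{C\sigma_Z}{\sigma_Y}\right)^2 + 2\left(\frac{B\sigma_X}{\sigma_Y}\right)\left(\frac{C\sigma_Z}{\sigma_Y}\right) r_{XZ},$$

or

$$1 = p_{YX}^2 + p_{YZ}^2 + 2\,p_{YX}\,p_{YZ}\,r_{XZ}. \tag{8}$$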


The values of Y in our numerical examples are completely determined by 
X and Z. Hence, for Figures 1 and 2, we have respectively 


$$\begin{aligned}
(.60)^2 + (.80)^2 &= 1,\\
(.4932)^2 + (.6576)^2 + 2(.4932)(.6576)(.50) &= 1.
\end{aligned} \tag{8'}$$

The terms of (8) are called the “coefficients of determination”. Thus,
for Figure 1, we say that $(.60)^2$ = 36% of Y’s variation is determined
by the variation in X; this conforms with the usual statement that $\sigma_Y^2$
will be reduced to $\sigma_Y^2(1 - r_{YX}^2)$ when X is “held constant” in the ordinary
linear regression analysis of Y on X. In Figure 2 we say that X directly
determines $(.4932)^2$ = 24.32% of Y’s variation while the joint effect of
X and Z determines 2(.4932)(.6576)(.50) = 32.43% of Y’s variation.
When the causal factors are positively correlated, the expression (8) 
for determination shows that the higher the correlation, the larger the 
joint effect of X and Z, and consequently the smaller the direct effects 
of X and Z separately. When rxz is negative, there is a semantic 
difficulty in speaking of a negative percentage. The meaning is never- 
theless clear: the negative correlation diminishes the variance of Y, 
making it smaller than it would be if X and Z were uncorrelated. 
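
The behaviour of the three terms of (8) as the correlation between the causes changes can be sketched as follows. This is an illustration under stated assumptions: B = .30 and C = .32 are the regression coefficients found above, $\sigma_X$ = 4 and $\sigma_Z$ = 5 are the standard deviations computed from the Table 2 data, and holding them fixed while $r_{XZ}$ is varied (including the negative value) is a hypothetical exercise, not something done in the paper. The function name `determination_shares` is likewise invented.

```python
def determination_shares(B, C, sd_x, sd_z, r_xz):
    """Split Var(Y) for Y = A + B*X + C*Z (no residual) into the terms of (7)-(8).

    Returns (var_y, share_x, share_z, share_joint); the shares sum to 1.
    """
    term_x = (B * sd_x) ** 2
    term_z = (C * sd_z) ** 2
    term_joint = 2 * B * C * sd_x * sd_z * r_xz
    var_y = term_x + term_z + term_joint
    return var_y, term_x / var_y, term_z / var_y, term_joint / var_y

B, C, SD_X, SD_Z = 0.30, 0.32, 4.0, 5.0

uncorrelated = determination_shares(B, C, SD_X, SD_Z, 0.0)    # as in Figure 1
correlated = determination_shares(B, C, SD_X, SD_Z, 0.5)      # as in Figure 2
negative = determination_shares(B, C, SD_X, SD_Z, -0.5)       # hypothetical case

for label, (var_y, sx, sz, sj) in [("r_XZ =  0.0", uncorrelated),
                                   ("r_XZ =  0.5", correlated),
                                   ("r_XZ = -0.5", negative)]:
    print(f"{label}: Var(Y) = {var_y:.2f}, X {sx:.2%}, Z {sz:.2%}, joint {sj:.2%}")
```

With a negative correlation the joint term is negative and the variance of Y falls below its value for uncorrelated causes, which is the point made in the text about the "negative percentage".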

General formulas given by Wright (1955), which are of the type of our 
(6), may be obtained immediately by writing down the standardized 
form of “normal” equations of type (4). 


THE CORRELATION COMPONENTS 


The most direct application of the path method is the deduction of 
the correlation between two variables (say X and Y), which are linear 
functions of some common variables. For example, suppose 


$$\begin{aligned}
X &= B_0 + B_1Z_1 + B_2Z_2 + B_3Z_3 + E_1,\\
Y &= C_0 + C_1Z_1 + C_2Z_2 + C_3Z_3 + E_2.
\end{aligned}$$

Upon standardization, the two linear expressions above become

$$\begin{aligned}
x &= b_1z_1 + b_2z_2 + b_3z_3 + p_1e_1,\\
y &= c_1z_1 + c_2z_2 + c_3z_3 + p_2e_2.
\end{aligned}$$

Now assume that the two E’s are neither correlated with each other nor
with any of the Z’s. Then the correlation between X and Y is

$$\begin{aligned}
r_{XY} = \operatorname{cov}(x, y) &= \mathcal{E}(b_1z_1 + b_2z_2 + b_3z_3)(c_1z_1 + c_2z_2 + c_3z_3)\\
&= b_1c_1 + b_1r_{12}c_2 + b_1r_{13}c_3\\
&\quad + b_2c_2 + b_2r_{21}c_1 + b_2r_{23}c_3\\
&\quad + b_3c_3 + b_3r_{31}c_1 + b_3r_{32}c_2.
\end{aligned} \tag{9}$$

Translating the expression into a diagrammatic form (Fig. 3), we see 
that ryy is the sum of the nine compound paths by which X and Y are 


FIGURE 3
The correlation between X and Y due to common causes $Z_1$, $Z_2$, $Z_3$.


connected. In the language of path analysis, we say that the three 
Z’s are the “common causes” of X and Y. The latter are correlated 
because of the presence of the common influences of the Z’s. Some of 
the terms of (9) may drop out if any two of the common causes are 
uncorrelated between themselves. In particular, when all of the common
causes are “independent” of each other, we simply have

$$r_{XY} = b_1c_1 + b_2c_2 + b_3c_3.$$

Writing $p_{Xi}$ for $b_i$ to denote the path coefficient from the common
cause $Z_i$ to X, etc., the general form of (9) may be written

$$r_{XY} = \sum_i p_{Xi}\,p_{Yi} + \sum_{i \ne j} p_{Xi}\,r_{ij}\,p_{Yj}, \tag{10}$$

where i = a common cause. This is a fundamental theorem in the
theory of path coefficients. In words, briefly, the correlation between 
two variables is the sum of all the paths connecting them. 
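
Theorem (10) amounts to summing every compound path from X back to a common cause and forward to Y. A minimal sketch, with hypothetical path coefficients and correlations among three common causes (none of these numbers come from the paper, and the function name is invented):

```python
def correlation_from_common_causes(p_x, p_y, r):
    """Sum of all compound paths  p_Xi * r_ij * p_Yj,  with r_ii = 1.

    p_x[i], p_y[i]: path coefficients from common cause i to X and to Y.
    r[i][j]: correlation between common causes i and j.
    """
    n = len(p_x)
    return sum(p_x[i] * r[i][j] * p_y[j] for i in range(n) for j in range(n))

# Hypothetical values, for illustration only.
b = [0.5, 0.3, 0.2]                 # paths Z_i -> X
c = [0.4, 0.4, 0.3]                 # paths Z_i -> Y
r_correlated = [[1.0, 0.2, 0.0],
                [0.2, 1.0, 0.1],
                [0.0, 0.1, 1.0]]
r_independent = [[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]

r_xy_general = correlation_from_common_causes(b, c, r_correlated)
r_xy_independent = correlation_from_common_causes(b, c, r_independent)

print("independent causes:", r_xy_independent)   # b1*c1 + b2*c2 + b3*c3
print("correlated causes: ", r_xy_general)       # all nine compound paths
```

Terms with $r_{ij}$ = 0 drop out automatically, so the same function covers both the general case (9) and the special case of independent causes.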

When this same principle is applied to genetics, we will obtain the 
correlation between relatives, for the true or “blood” relatives are simply 
those sharing one or more common ancestors, who constitute the common 
“causes” for the heredities of the relatives. This is the reason why the
method of path coefficients has been so successful not only in calculating
the genetic correlations but also in breaking them up into components, so
that each component represents the contribution by a certain common 
ancestor through a particular line of descent. 


MENDELIAN VARIABLES 


It is fully realized that the above sketchy account does not do full 
justice to the method of path coefficients but it must suffice; and now we 
turn to its applications in population genetics. In the course of con- 
sidering genetic problems, some further properties may be elucidated. 
In the following discussion we shall ignore linkage, sex-linked, and 
polysomic inheritance, as well as environmental effects, and concentrate 
on the autosomal genes of bisexual diploid organisms. 

Consider a metrical character that is entirely controlled by a certain 
number of genes. We shall first assume that each gene has a specified 
amount of effect upon the trait, without dominance or interaction 
(epistasis) between loci. If the contributions of AA, Aa, aa to the trait
are 2α, α, 0; and those of BB, Bb, bb are 2β, β, 0, respectively, etc., each
independent of other loci, then the value of a whole genotype is the sum
of the effects of individual loci. Briefly, all genic effects are additive.
We shall refer to the value of the metrical character of the whole genotype
as the “genotypical value” (of an individual).

One has undoubtedly heard the statement something like this: a 
child is a half-and-half of the parents. Thus, we talk about one’s being 
half-Irish and half-German. In this sense the statement is understand- 
able. From the observation that a child often shows a feature intermedi- 
ate between those of his parents, the statement taken descriptively is 
acceptable. With respect to the fact that a child receives an equal 
number of chromosomes from each of his parents, the statement is even 
scientifically accurate. Despite all this, when we talk about a metrical
trait, it may be surprising to many that the above statement is simply
not true, or at least not very precise. The height, for instance (assuming
height to be entirely controlled by additive genic effects), of an adult
offspring is not necessarily equal to ½ father’s and ½ mother’s heights,
except in very special cases. Were every adult offspring to assume the
mean height of his parents, then in a few generations all adults of a 
population would be of the same height! 

Then, what is the relationship between an offspring’s and his parents’ 
measurements? There are several statistical approaches to this problem; 
the most novel one is that employed by Wright. His method rests upon 
the introduction of gametic variables and the subsequent analysis by path 
coefficients. If we imagine that a (haploid) gamete assumes a value 
corresponding to its genic content, then the genotypical value of an 
individual is the sum of the values of the two gametes that united to 
produce the individual. Hence, we obtain the fundamental causal 
scheme: 


$$y = g_1 + g_2, \tag{11}$$


where y is the genotypical value of an offspring and $g_1$ is the value of the
gamete contributed by the father and $g_2$ that of the mother. The
statement (11) is an exact one; it is not the same as saying that a child attains
the mean value of his parents. Assuming additive genic effects and
autosomal inheritance, y is completely, linearly and equally determined
by $g_1$ and $g_2$. Furthermore, we shall treat $g_1$, $g_2$, and y as standardized
variables. Then we see that (11) is a regression equation of the type
(5.2) or (5.1), depending on whether $g_1$ and $g_2$ are correlated or not. The
path coefficients from the two g’s to y must be of the same magnitude, 
since the two gametes have equal influence on y. Following Wright’s 
original notation, let a denote the path coefficient from a uniting gamete 
to the resultant zygote. The theorem (8) on complete determination
yields in this case (Figs. 4 and 5, Left):

$$\begin{aligned}
&\text{for uncorrelated } g\text{'s:} & a^2 + a^2 &= 1, & a &= \sqrt{\tfrac{1}{2}},\\
&\text{for correlated } g\text{'s:} & 2a^2 + 2a^2F &= 1, & a &= \sqrt{\frac{1}{2(1 + F)}},
\end{aligned} \tag{12}$$

where F is the correlation between the two g’s. Note that for
uncorrelated causes, $a = \sqrt{1/2}$ is also the correlation coefficient between y and
one of the g’s.

The truth of (12), like all others to be presented in this section, is
independent of the trait under consideration, independent of the units
of measurement, independent of the number of loci involved, and
independent of the gene frequencies of the population.

Next, let us consider the path coefficient, b, from an individual to a
gamete produced by him (Figs. 4 and 5, Right). The genic content of
a gamete is of course limited by that of the individual, but within that
limitation it is subject to chance of segregation in each meiosis. The
correlation between an individual y and a gamete g produced by him is
the same as the correlation between y and a gamete g′ of the previous
generation that produced the individual. Hence (for details see Li,
1955, p. 174),

$$\begin{aligned}
&\text{for a random individual:} & b &= \sqrt{\tfrac{1}{2}},\\
&\text{for an inbred individual:} & b &= \sqrt{\tfrac{1}{2}(1 + F')},
\end{aligned} \tag{13}$$

where the primes indicate the previous generation.

It will be noticed that both a and b are expressed in terms of the
correlation between uniting gametes. The statistic F plays a cardinal
role in the path analysis of genetical problems of this sort. It is also
known as the inbreeding coefficient of an individual (whose parental
gametes have a correlation F). Of course, F cannot be directly observed
since the gametes possess no metrical value in the real physical sense. 
It is purely conceptual, or, shall we say, it is a mathematical device to 
facilitate the analysis. This brings out an important feature of the 
general method of constructing causal diagrams and the subsequent path 
analysis: a variable pertinent to the causal scheme should be included 
in the diagram whether it is observable or not. 

The correlation between the two uniting gametes is due to the corre- 
lation between the two parents which actually could be observed and 
measured. In Fig. 4 the two parents are uncorrelated. In Fig. 5 the 
correlation between the parents (also called “‘mates’’) is m. Hence: 


$$F = b^2m. \tag{14}$$

Without spelling out all the details except to reiterate that a compound
path value is equal to the product of single path values and that the 
correlation between two variables is the sum of all paths connecting 
them, we obtain the following results by examining Figs. 4, 5, 6: 


Correlation          Panmixia                                          Inbreeding
Parent-parent:       $r_{PP} = 0$                                      $r_{PP} = m$
Parent-offspring:    $r_{PO} = \sqrt{1/2}\,\sqrt{1/2} = 1/2$           $r_{PO} = ab(1 + m)$           (15)
Sib-sib:             $r_{OO} = 2(\sqrt{1/2})^2(\sqrt{1/2})^2 = 1/2$    $r_{OO} = 2a^2b^2(1 + m)$


We started out with the simple causal scheme (11) between gametes 
and zygotes, and subsequently obtained four formulas, (12)-(15), 
which can be considered the four corner stones of population genetics. 
It should be clear, however, that all of these relations are consequences 
of the Mendelian mechanism in heredity, only expressed in a very 
concise manner. When every gene has a qualitative effect, the results 
are the various “ratios’’; when every gene has a quantitative effect, the 
results are the various correlations. Therefore, the set of parental, 
gametic, and offspring’s metrical measurements may well be called 
“Mendelian variables’. A Mendelian variable is a random variable 
conditioned by Mendelism. Although the method of path coefficients 
is applicable to a wide range of problems, it is particularly suitable for 
the analysis of Mendelian variables that have linear additive properties, 
owing to the clearcut “cause-and-effect” relationship in heredity.
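
The four relations (12)-(15) can be collected into a few lines of code. This is a sketch only; the function names and the argument convention (the correlation m between the parents and the inbreeding coefficient of the parents, written `F_parents`) are assumptions made for the illustration.

```python
from math import sqrt

def a_path(F):
    """Path from a uniting gamete to the zygote, eq. (12)."""
    return sqrt(1 / (2 * (1 + F)))

def b_path(F_prev):
    """Path from an individual to a gamete it produces, eq. (13)."""
    return sqrt((1 + F_prev) / 2)

def correlations(m, F_parents):
    """Correlations between relatives, eq. (15), for parents with correlation m
    and inbreeding coefficient F_parents."""
    b = b_path(F_parents)
    F = b * b * m                     # eq. (14): inbreeding of the offspring
    a = a_path(F)
    return {
        "F": F,
        "parent-parent": m,
        "parent-offspring": a * b * (1 + m),
        "sib-sib": 2 * a * a * b * b * (1 + m),
    }

panmixia = correlations(m=0.0, F_parents=0.0)
# Parents who are full sibs from a random-mating population (r = 1/2).
sib_mated = correlations(m=0.5, F_parents=0.0)

print(panmixia)
print(sib_mated)
```

For panmixia the function returns the familiar value of one half for both the parent-offspring and the sib-sib correlation; the second call shows how the same formulas behave once the parents are correlated.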


SYSTEMS OF INBREEDING 


The subject of continued inbreeding of a Mendelian population has 
been a field of active research since World War I. In this section we 
shall not study the effects of inbreeding as such, but rather cite one or 
two examples to illustrate how the results of continued inbreeding may 
be obtained by using path coefficients in a way that is by far more 
economical than any other known method. 

Briefly, to obtain the results of continued inbreeding merely involves 
repeated application of the four fundamental formulas (12)-(15) estab- 
lished in the previous section. The gist of the whole trick is this: when 
a system (or pattern) of inbreeding is specified, there corresponds a 
causal diagram in which the correlation between mates (and thus 
parents of the next generation) may be expressed in terms of path and 
correlation coefficients of the previous generation. This will finally
lead to a recurrence formula for F between the successive generations.
For example, continued brother-sister mating, the inbreeding system
most extensively studied by statisticians, geneticists, and animal
breeders, may be summed up by one expression: $m = r'_{OO}$ (Fig. 7), where
$r'_{OO} = 2(a'b')^2(1 + m')$ according to (15). This single expression spells
out the key fact of the system: the correlation between mates of one
generation is the correlation between full sibs of the previous generation.
Expressing everything in terms of F, F′, F″, we immediately obtain
the recurrence relation $F = \tfrac{1}{4}(1 + 2F' + F'')$, from which the
corresponding recurrence relation for heterozygosis may be deduced. Since
the results of full sib mating are so well known, we shall examine another
system of inbreeding in some detail.

FIGURE 7. Full sib mating: $m = r'_{OO}$.
FIGURE 8. Full sisters × half brother mating: $m = a'^2b'^2(1 + 2m' + r''_{OO})$.
FIGURE 9. Double first cousin mating: $m = 2a'^2b'^2(m' + r''_{OO})$.
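
The full-sib recurrence is easily iterated. The sketch below uses exact fractions and assumes, as a starting condition not spelled out in the text, that the two generations preceding the first sib mating are non-inbred (F = 0); the function name is invented.

```python
from fractions import Fraction

def full_sib_F(generations):
    """Iterate F = (1/4)(1 + 2F' + F'') for continued full-sib mating.

    Assumption: the two starting generations are non-inbred (F = 0).
    Returns the list [F_0, F_1, ..., F_{generations + 1}].
    """
    F = [Fraction(0), Fraction(0)]
    while len(F) < generations + 2:
        F.append(Fraction(1, 4) * (1 + 2 * F[-1] + F[-2]))
    return F

F_series = full_sib_F(4)
# Heterozygosis relative to its initial value is 1 - F.
H_relative = [1 - f for f in F_series]

print([str(f) for f in F_series])
print([str(h) for h in H_relative])
```

Substituting F = 1 − h into the recurrence gives h = ½h′ + ¼h″ for the relative heterozygosis, which the computed series satisfies term by term.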

The mating system we propose to consider involves three mating 
individuals in each generation—one male (B) and two females (C, D). 
One mating (B × C, say) produces a son and the other mating (B × D)
produces two daughters. The son ($B_1$) is thus a half brother of the
daughters ($C_1$, $D_1$) who are full sisters of each other (Fig. 8). In the
next generation the matings will be $B_1 \times C_1$ yielding a son and $B_1 \times D_1$
yielding two daughters again, and so on. For brevity we shall call it
the “full sisters × half brother” mating system. The correlations
between the three individuals of the previous generation are indicated
at the top of Fig. 8. Now let us consider the correlation between mates
of the following generation. The male and one of the females (e.g.
concentrating on the one next to the male in the diagram) are connected
by four chains:

(i) directly via their father: $a'b'b'a'$
(ii) via father and female’s mother: $a'b'm'b'a'$
(iii) via male’s mother and father: $a'b'm'b'a'$
(iv) via the two mothers: $a'b'\,r''_{OO}\,b'a'$


The correlation between mates, being the sum of these four compound 
path values, is thus 


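$$m = a'^2b'^2(1 + 2m' + r''_{OO}), \tag{16}$$

which is the key formula of the mating system. The rest is simply a
matter of algebraic substitution. From (12), we have $a'^2 = 1/[2(1 + F')]$;
from (13), $b^2 = \tfrac12(1 + F')$ and $b'^2 = \tfrac12(1 + F'')$, and similarly
$b''^2 = \tfrac12(1 + F''')$; combining (12) and (13), $b'^2a''^2 = b^2a'^2 = \tfrac14$;
from (14), $b'^2m' = F'$ and $b''^2m'' = F''$;
and from (15), $r''_{OO} = 2a''^2b''^2(m'' + 1)$. Substituting these values in
(16), we have:

$$\begin{aligned}
m &= a'^2\left[b'^2 + 2b'^2m' + 2b'^2a''^2\,(b''^2m'' + b''^2)\right],\\
F = b^2m &= \tfrac14\left[\tfrac12(1 + F'') + 2F' + \tfrac12\left(F'' + \tfrac12(1 + F''')\right)\right],\\
F &= \tfrac{1}{16}(3 + 8F' + 4F'' + F''').
\end{aligned} \tag{17}$$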


The recurrence relation (17) represents the way the “full sisters
× half brother” system works on F. Each inbreeding system increases
the values of F of successive generations in a peculiar way. The re- 
currence relation is an intrinsic property of the mating system, independ- 
ent of the initial condition of the population, number of loci involved, 
and the gene frequencies. This is another reason why F, the correlation 
between uniting gametes, plays such a dominant role in population 
genetics. Indeed, it is not too much to say that the major responsibility 
of path coefficients is to find the value of F in a population under various 
circumstances. 
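
The algebraic substitution leading from (16) to (17) can be checked numerically by building F from the individual path values. This is a sketch under the relations (12)-(15) as stated above; the function names and the trial values of F′, F″, F‴ are hypothetical.

```python
def F_from_paths(F1, F2, F3):
    """F of the present generation from F', F'', F''' by way of (12)-(16)."""
    a1_sq = 1 / (2 * (1 + F1))      # a'^2,  eq. (12)
    a2_sq = 1 / (2 * (1 + F2))      # a''^2, eq. (12)
    b_sq = (1 + F1) / 2             # b^2,   eq. (13)
    b1_sq = (1 + F2) / 2            # b'^2,  eq. (13)
    b2_sq = (1 + F3) / 2            # b''^2, eq. (13)
    m1 = F1 / b1_sq                 # m'  from eq. (14): F'  = b'^2 m'
    m2 = F2 / b2_sq                 # m'' from eq. (14): F'' = b''^2 m''
    r2_oo = 2 * a2_sq * b2_sq * (1 + m2)         # sib correlation, eq. (15)
    m = a1_sq * b1_sq * (1 + 2 * m1 + r2_oo)     # key formula, eq. (16)
    return b_sq * m                               # eq. (14)

def F_recurrence(F1, F2, F3):
    """Closed form (17)."""
    return (3 + 8 * F1 + 4 * F2 + F3) / 16

# Arbitrary trial values of (F', F'', F'''), for illustration only.
trials = [(0.0, 0.0, 0.0), (0.3, 0.2, 0.1), (0.55, 0.4, 0.25)]
for F1, F2, F3 in trials:
    print(F_from_paths(F1, F2, F3), F_recurrence(F1, F2, F3))
```

The two columns printed agree for every trial triple, which is what the substitution in the text asserts.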

There are many population properties closely related to F; but they
are beyond the scope of this review. We must, however, mention the
relation of F and H, the proportion of heterozygosis in a population.
The meaning of H may be viewed in three different ways, with as many
different physical meanings. First, with respect to one pair of loci,
H denotes the proportion of heterozygotes (Aa) in the population. 
Second, n individuals each having k pairs of loci may be regarded as a 
population of nk pairs of genes, ignoring the entities of individuals. 
Then H denotes the proportion of heterozygous pairs of loci in the 
population. Third, considering the many pairs of loci present in an 
individual, H denotes the proportion of heterozygous pairs in the indi- 
vidual. All these three viewpoints are useful, and, fortunately, they 
are mathematically equivalent. Whatever interpretation is adopted,
H is proportional to 1 − F in any generation; thus,

$$H = H_0(1 - F), \quad H' = H_0(1 - F'), \quad H'' = H_0(1 - F''), \quad \text{etc.},$$

where $H_0$ is a constant (the initial value of H before inbreeding).
Substituting in (17), we obtain the recurrence relation of H:

$$H = \tfrac12H' + \tfrac14H'' + \tfrac{1}{16}H''', \tag{18}$$

which is independent of the initial condition $H_0$. The relation shows
that H decreases in each generation as inbreeding proceeds. For
example, if the inbreeding system starts out with one male mated to
two full sisters (unrelated to him) and the initial value of H is ½, we will
have the following H series in successive generations:


4 8 13 23 40 1389 242 421 


: 
HB: 498° 16'32’ 64? 128512’ 10422048’ 


H/H’: 1,1, .8125, .8846, .8696, .8687, .8705, .8698, 


After a sufficiently large number of generations, the ratio H/H' will
approach a constant value, implying that the heterozygosis will
thereafter decrease at a constant rate. To find this constant eventual
rate of decrease, we put H/H' = H'/H'' = λ in (18), which then becomes


$\lambda^3 - \tfrac{1}{2}\lambda^2 - \tfrac{1}{4}\lambda - \tfrac{1}{16} = 0.$ (19)


The largest positive root of this equation is λ = .86995 ≈ .870. That
is, the heterozygosis decreases by 13% per generation under this in-
breeding system.
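
The series above and the root of (19) can be checked numerically. The following is a minimal Python sketch and not part of the original analysis: the function names are illustrative, and the starting values assume H = 1/2 in each of the three generations that precede the first inbred one.

```python
from fractions import Fraction


def h_series(coeffs, start, generations):
    """Iterate H = c1*H' + c2*H'' + ... beginning from the values in `start`."""
    series = list(start)
    for _ in range(generations):
        recent = series[::-1][:len(coeffs)]  # H', H'', H''' (most recent first)
        series.append(sum(c * h for c, h in zip(coeffs, recent)))
    return series


def limiting_ratio(coeffs, tol=1e-12):
    """Positive root of lam^k = c1*lam^(k-1) + ... + ck, found by bisection on (0, 1)."""
    k = len(coeffs)

    def g(lam):
        return lam ** k - sum(c * lam ** (k - 1 - i) for i, c in enumerate(coeffs))

    lo, hi = 0.0, 1.0  # g(0) < 0 and g(1) > 0 when the coefficients sum to less than 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    half = Fraction(1, 2)
    coeffs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 16)]
    print(h_series(coeffs, [half, half, half], 6))
    print(limiting_ratio([0.5, 0.25, 0.0625]))
```

Passing the coefficients of another mating system (for example 1/2 and 1/4 for full sibs) gives the corresponding limiting ratio in the same way.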

The results of continued double first cousin mating may be obtained 
in exactly the same manner. The six correlations between the four 
individuals of the previous generation are indicated at the top of Fig. 9. 
The mates are also connected by four chains. The key formula is given 
at the bottom of that diagram. The whole analysis takes but four steps 
analogous to (16), (17), (18), (19). These and previous results have 
been summarized in Table 3, in which the size N refers to the number of 
mating individuals involved in each generation. 

Finally, perhaps a few words should be said about the method of 
matrix algebra (brief account in Li, 1955, pp. 108-112, 116-118). Without 
doubt the matrix method gives a fuller account of the inbreeding process. 
For instance, with respect to one locus, it gives the frequencies of the 
various types of mating in each generation. However, it becomes 
difficult to handle when applied to a not-so-simple mating system, 
although Fisher (1949) has displayed great skill in such cases. On the
other hand, the path method yields the most pertinent information on
an inbreeding system in very short order but little else.


TABLE 3

Some Results of Continued Inbreeding

Mating system                  Size N   Recurrence relation of H                 Limiting value
Selfing                        1        H = (1/2)H'                              λ = .500
Full sibs                      2        H = (1/2)H' + (1/4)H''                   λ = .809
Full sisters × half brother    3        H = (1/2)H' + (1/4)H'' + (1/16)H'''      λ = .870
Double first cousins           4        H = (1/2)H' + (1/4)H'' + (1/8)H'''       λ = .920


PANMICTIC FINITE POPULATIONS 


The general pattern of analysis with respect to a random mating 
group of $N_m$ males and $N_f$ females is the same as that described in the
previous section, viz., first obtain an expression for m, then deduce the
recurrence relations of F and H, and finally solve for the limiting value
of H/H' = λ. The important conclusion from this analysis is that
heterozygosis in a finite population decreases, although maybe slowly,
from generation to generation even though random mating is the rule.
We shall not give all the details here except to say that the m in this
case is the weighted average correlation for full sib and half sib matings
as well as those between more remote relatives. Writing ε for $1/N_m +
1/N_f = (N_m + N_f)/N_m N_f$, which in general is a small positive fraction,
the recurrence relation of H turns out to be


$H = \left(1 - \tfrac{1}{4}\epsilon\right)H' + \tfrac{1}{8}\epsilon H''$ (20)


and
$\lambda = \tfrac{1}{8}\left\{4 - \epsilon + \sqrt{16 + \epsilon^2}\right\}.$ (21)


The recurrence relation (20) and the limiting ratio (21) are exact and 
do not involve approximations. As far as the writer’s knowledge goes, 
these exact results have never been worked out by any other method 
although certain approximations for certain special cases have. 

The original expression for percentage decrease per generation given
by Wright (1931, p. 108) is equal to 1 − λ, and the expression in his
1951 review (p. 347) is λ − 1, a negative quantity.

When $N_m$ and $N_f$ are of moderate size, the fraction ε will be so small
that $\sqrt{16 + \epsilon^2} \approx 4 + \tfrac{1}{8}\epsilon^2$. With this approximation, the proportional
decrease per generation in H becomes


$1 - \lambda \approx \tfrac{1}{8}\epsilon\left(1 - \tfrac{1}{8}\epsilon\right) = \left(\frac{1}{8N_m} + \frac{1}{8N_f}\right)\left(1 - \frac{1}{8N_m} - \frac{1}{8N_f}\right) \approx \frac{1}{8N_m} + \frac{1}{8N_f}.$

In the special case in which there are equal numbers of males and females
($N_m = N_f = \tfrac{1}{2}N$) in a group of N individuals, the fraction ε = 4/N, and


$H = \frac{N - 1}{N}H' + \frac{1}{2N}H''$ (20')

This is the basis of the frequently encountered statement that hetero-
zygosis in a finite population decreases by 1/2N per generation. This last
result is equivalent to the findings of Fisher (1930, p. 87) through a
different method. After t generations the heterozygosis will be $e^{-t/2N}$
of the original value. Thus, putting $e^{-t/2N} = \tfrac{1}{2}$, we obtain $t = 2N \log_e 2$
= 1.386N as the number of generations required to halve the hetero- 
zygosis proportion in a population of size N. It follows that a random 
mating population of a limited size will eventually, in the absence of 
other forces, attain complete homozygosis. This conclusion had a 
great impact in forming the modern theory of evolution and contri- 
buted much toward understanding natural populations. 
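
The exact limiting ratio, its first-order approximation, and the halving time can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the limiting ratio has the form λ = (4 − ε + √(16 + ε²))/8 with ε = 1/N_m + 1/N_f, N_m and N_f being the numbers of males and females.

```python
import math


def limiting_ratio(n_males, n_females):
    """Limiting ratio H/H' for a random mating group, with eps = 1/N_m + 1/N_f."""
    eps = 1.0 / n_males + 1.0 / n_females
    return (4.0 - eps + math.sqrt(16.0 + eps * eps)) / 8.0


def approximate_decrease(n_males, n_females):
    """First-order approximation to 1 - lambda: 1/(8 N_m) + 1/(8 N_f)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


def generations_to_halve(n_total):
    """t = 2N log_e 2, obtained from exp(-t / 2N) = 1/2."""
    return 2.0 * n_total * math.log(2.0)


if __name__ == "__main__":
    lam = limiting_ratio(50, 50)
    print(1.0 - lam, approximate_decrease(50, 50), generations_to_halve(100))
```

For 50 males and 50 females the exact decrease 1 − λ lies slightly below 1/2N = .005, as the second-order factor in the approximation suggests.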


FURTHER APPLICATIONS IN GENETICS 


It does not seem practical to go much further into the subject. In 
the following we shall merely mention a few examples to illustrate the 
scope of generalization obtained by path method rather than discuss 
specific implications. 


1. Non-additive genic effects. As far as the proportion of hetero- 
zygosis is concerned, the above results are true whether there is domi- 
nance or not. Dominance only affects the various correlation values. 
The path coefficients can be applied to the case with dominance for 
unilineal relatives, but not for bilineal relatives (such as full sibs and 
double first cousins). Where the method applies, a path from genotype 
to phenotype may be introduced. 


2. Phenotypic assortative mating also leads to correlation between
mates and also can be analyzed by path coefficients (Wright, 1921, III).
It must be admitted, however, that the analysis is a complicated one
even with the help of path coefficients. A more recent treatment of this 
and some related subjects may be found in Reeve (1953). 


3. Equilibrium with inbreeding. A population does not necessarily 
reach complete homozygosis if the inbreeding is only of a moderate 
degree, or if the correlation between mates stays at a certain constant 
level in each generation. When an equilibrium condition is reached, 
$F = \tfrac{1}{2}m/(1 - \tfrac{1}{2}m)$, $a^2 = \tfrac{1}{2}(1 - \tfrac{1}{2}m)$, $b^2 = \tfrac{1}{2}/(1 - \tfrac{1}{2}m)$, so that the path
from a parent to an offspring is $ba = \tfrac{1}{2}$ whatever the mating correlation.


4. Sex-linked inheritance. The method of path coefficients can be
applied to the analysis of sex-linked loci just as easily as autosomal loci
(Wright, 1933a). Only slight modifications are needed: the path from
father (heterogametic sex) to his gamete is b = 1 because there is no
segregation for a haploid; and the path from the male gamete to daughter
is a as usual, but that to son is a = 0. The path from mother (homo-
gametic) to her gamete is b as usual, but that from her gamete to son is 
a = 1 since the female gamete completely determines the genotype of 
the son (illustrations in Li, 1955, p. 182, 185). With these modifications 
analogous results for systems of inbreeding and panmictic finite popu- 
lations have been obtained. Further examples may be found in Crow 
and Roberts (1950). The separation of sex-linked and autosomal genetic 
variances of a quantitative trait can also be easily achieved by the 
method of path analysis (Reeve, 1953). 


5. Polysomic inheritance. In 1938 the method was extended by 
Wright to the analysis of the effects of inbreeding in the case of poly- 
somic loci. He carried the results much farther than had been reached 
by the matrix method, and obtained a general recurrence relation of H 
for 2k-somic loci for some simple systems of inbreeding. 


6. Linkage. The lack of independent assortment between loci 
introduces difficulty in analysis by any method. However, the inter- 
ference of inbreeding with the recombination of linked genes has also 
been studied by Wright (1933b). 


7. Isolation by distance. Naturalists and ecologists have long been
aware of the isolation effects of mere distance, even though the species is
continuously distributed over a large area. The quantification of the
problem and the subsequent analysis by Wright (1943, 1946, 1951
Appendix F) is also based upon path coefficients and recurrence relations 
of F. The mathematics involved is unfortunately not as simple as we 
would like it to be, but the writer knows of no other treatment of the 
subject through a simpler method.


8. Environmental influence on a Mendelian variable may be arbi-
trarily defined as the influences of all causes other than genetic. In 
path analysis a separate variable, assumed to be independent of heredity, 
may be introduced into the causal scheme so that the variations in a 
dependent variable may be completely accounted for. The causal 
scheme is then formally complete. 


9. Animal breeding. The studies of Wright have exerted much 
influence on animal breeding programs and helped in understanding 
some of the experimental results. Breeding, of course, is by no means a 
purely genetical problem. Those interested in this subject may refer 
to Lush (1949) and Lerner (1950). 


10. Theory of Evolution. It is entirely out of the scope of this paper 
to deal with Wright’s theory of evolution (which is quoted and discussed 
at length by Dobzhansky, 1951). Here it need only be said that path 
and inbreeding coefficients played a basic role in the early stages of the 
formulation of the theory.


It is frequently of interest in a biological investigation to estimate
the number of different kinds of objects in some discrete population. 
Two examples are the number of species in a plant or animal community 
and the number of gene loci on a chromosome. 

Clearly any estimation procedure will depend upon the frequency 
distribution of the various classes in the population. There are two 
types of distributions which are of general interest and which are easily 
amenable to estimation procedures. The first of these is that in which 
the classes have some unimodal distribution which can be specified in 
terms of the modal frequency. Examples are the Poisson or any discrete 
distribution approximating a normal or log normal distribution. Preston 
(1948) has dealt with the problem of the number of species in a com- 
munity and has derived an estimate based upon an approximate log- 
normal distribution of the classes. This method applies for any of 
the class of distributions specified above. 

A second case of importance is that in which the classes are uniformly 
distributed. It is this case which will be dealt with here, the derived 
estimate being applied to the estimation of the number of genes on a 
chromosome. 


THE MAXIMUM LIKELIHOOD ESTIMATOR 


It is assumed that a population contains n classes, all of which are
equally represented. If a sample of N objects is taken from such a
population it will be found that $r_i$ objects fall into the ith class. Then
the probability of obtaining a given sample is the multinomial prob-
ability


(1) $P = \frac{N!}{\prod_{i=1}^{n} r_i!}\left(\frac{1}{n}\right)^{N}$


*Contribution from the North Carolina Agriculture Experiment Station. Published with the 
approval of the Director as Paper No. 647 of the Journal Series.

Not all of the classes in the population will be represented in the sample,
however. Let us say k classes are represented at least once while
(n − k) are not represented. This means simply that (n − k) of the
$r_i$'s in (1) are zero. Expression (1) then becomes

(2) $P = \frac{N!}{n^{N}\prod_{i=1}^{k} r_i!}$


Not all samples are distinguishable from each other, however. It is not
known which set of k classes of the n classes in the population are found
in the sample. There are

(3) $\frac{n!}{k!\,(n - k)!}$


different samples containing k classes. In addition, among the k classes
observed it is not known which class is represented $r_i$ times. There are


(4) $\frac{k!}{\prod_{j} m_j!}$


ways in which k classes may be partitioned so that $m_j$ classes appear j
times. The probability of any observed or distinguishable sample is
then by (2), (3) and (4)

(5) $L = \left[\frac{N!}{\prod_{i=1}^{k} r_i!\,\prod_{j} m_j!}\right]\left[\frac{n!}{(n - k)!\,n^{N}}\right]$

which is by definition the Likelihood function.
Taking the logarithm of both sides of (5) and maximizing* with
respect to n we get


(6) $\frac{N}{\hat{n}} = \sum_{j=\hat{n}-k+1}^{\hat{n}} \frac{1}{j}$

The value of n̂ for which equation (6) holds differs by at most one
from the ML estimator of n. This expression may be solved for n̂
using a trial and error method from a table of the sums of the reciprocals
of the integers. Since such a table is not generally available, use may
be made of the approximation


(7) $\frac{N}{\hat{n}} = \ln\left(\frac{\hat{n}}{\hat{n} - k}\right)$ (Feller p. 175)


*Strictly, (5) is not differentiable with respect to n. However, if the factorials are expressed as their
equivalent Γ-function, then differentiated, and the result evaluated for integral values of n, expression
(6) results.

whose solution requires only a table of natural logarithms or exponentials.

There are several points of interest to be noted about this estimator. 
First, expression (6) or (7) is identical in form to the solution of the 
coupon collector’s problem as given by Feller (1950). This problem 
concerns the expected value of N for a given value of k, when n is known. 
That is, how large a sample must be drawn to have k out of n possible 
classes represented in the sample. The close relation of this problem 
to our estimation problem is clear. 

Second, any departure from equal representation of classes in the 
population makes (6) an underestimate of n. This is because the more 
frequent classes will appear more often in the sample, thus lowering the 
value of k. 

Third, if the sample size N is smaller than or equal to the number of 
classes in the population, there exists a finite probability that all of the 
classes in the sample will be different. That is, k may be equal to N. 
Moreover the smaller is the ratio N/n, the greater is the probability 
that k equals N. Now an inspection of expression (7) shows that when
k equals N, the estimate n̂ is infinite.

Last, the estimator (6) is sufficient. This can be seen as follows.
The likelihood expression (5) is written as the product of two expressions.
The left hand member in brackets is a function only of the observations,
while the right hand member in brackets is a function of k and n. Then
by definition k is a sufficient estimate of n. It follows directly from the
properties of the MLE that it too is sufficient. This is simply another
way of saying that all of the information relating to n in the sample is
contained in k.
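
In place of tables, the logarithmic approximation to the likelihood equation, N/n = ln(n/(n − k)), can be solved by bisection. The following Python sketch is illustrative only; the function name `mle_classes` and the numerical tolerances are assumptions, and no correction for multiple lethals or other adjustments is applied to N and k.

```python
import math


def mle_classes(N, k, iterations=200):
    """Solve N/n = ln(n / (n - k)) for n; the estimate is infinite when k == N."""
    if not 0 < k <= N:
        raise ValueError("need 0 < k <= N")
    if k == N:
        return math.inf  # every sampled object was different

    def g(n):
        # g is negative just above n = k and positive beyond the root
        return N / n - math.log(n / (n - k))

    lo = k + 1e-9
    if g(lo) >= 0:
        return float(k)  # N is so large relative to k that the root is essentially k
    hi = 2.0 * k + 1.0
    while g(hi) < 0:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(mle_classes(105, 86))
    print(mle_classes(10, 10))
```

The second call illustrates the third point above: when every object in the sample belongs to a different class, the estimate is infinite.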


THE DISTRIBUTION OF THE NUMBER OF CLASSES IN THE SAMPLE 


To find the variance of the MLE it is necessary to find the expected 
value of k. In addition to its importance in deriving the variance, the 
expected value of k is itself of some interest as it is in a sense the reverse 
of the coupon collector’s problem. 

Feller (p. 69) has derived an exact distribution for the number of 
classes not represented in a sample when all the classes are equally 
represented in the population. If m is the number of missing classes, then 


(8) $P(m) = \binom{n}{m}\sum_{\nu=0}^{n-m}(-1)^{\nu}\binom{n-m}{\nu}\left(1 - \frac{m+\nu}{n}\right)^{N}$


where all the notation has been previously defined by us. This exact 
distribution is unmanageable as it stands. However, Feller shows that 
this expression converges to a Poisson distribution. In this form

(9) $P(m) = \frac{e^{-\lambda}\lambda^{m}}{m!}$
where
(10) $\lambda = ne^{-N/n}$


The demonstration of equation (9) depends upon the fact that λ is
bounded. Should n grow very large for some fixed value of N, λ will
increase without bound. This limitation on the usefulness of the Poisson 
approximation is of great importance to the problem of finding confidence 
intervals for n and we shall return to it when confidence interval esti- 
mation is discussed. 

Before expression (9) corresponds exactly to the model in which we 
are interested, it must be noted that for our purposes m cannot be equal 
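to n. That is, at least one class is always represented. The desired
probability distribution of m is then


(11) $P'(m) = \frac{P(m)}{1 - P(m = n)}$

Now the expected value of m is

(12) $E(m) = \sum_{m=0}^{n-1} m\,\frac{e^{-\lambda}\lambda^{m}}{m!}\left[\frac{1}{1 - P(m = n)}\right]$


Letting m − 1 = x


(13) $E(m) = \lambda\left[\sum_{x=0}^{n-2}\frac{e^{-\lambda}\lambda^{x}}{x!}\cdot\frac{1}{1 - P(m = n)}\right]$


Note that the expression in brackets is equal to 1. Then


(14) $E(m) = \lambda = ne^{-N/n}$
but since

(15) $m = n - k$
(16) $E(k) = n\left(1 - e^{-N/n}\right)$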

Thus, we have derived an expression for the expected value of k. Ex-
pression (16) may also be written as


(17) $\frac{N}{n} = \ln\left(\frac{n}{n - E(k)}\right)$

which is virtually identical with our MLE and the solution of the coupon
collector's problem. (17) itself may serve as an estimator of n.
It is interesting that the validity of (16) is not affected by any
assumption about λ. If we allow n to grow very large in relation to N
in (16) it appears that the expected value of k is N. This is precisely
what is to be expected. That is, if the sample size is very much smaller 
than n, all of the members in the sample will be different. 
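
The behaviour of E(k) can be illustrated numerically. The sketch below compares the Poisson-based form n(1 − e^{−N/n}) with n[1 − (1 − 1/n)^N], the standard exact expectation for N draws from n equally likely classes; the latter is brought in here for comparison and is not taken from the text, and the function names are illustrative.

```python
import math


def expected_classes_poisson(n, N):
    """Expected number of classes in the sample, E(k) = n(1 - exp(-N/n))."""
    return n * (1.0 - math.exp(-N / n))


def expected_classes_exact(n, N):
    """Exact expectation n[1 - (1 - 1/n)^N] under equal representation."""
    return n * (1.0 - (1.0 - 1.0 / n) ** N)


if __name__ == "__main__":
    # moderate n: the two forms agree closely
    print(expected_classes_poisson(268, 105), expected_classes_exact(268, 105))
    # n very large relative to N: E(k) approaches N
    print(expected_classes_poisson(10 ** 6, 50))
```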

Equations (12), (13) and (14) are simply the proof that the expected 
value of a variable with a Poisson distribution is the parameter λ of that
distribution. This proof has been shown in detail here since a similar 
method will be used in the next section. 




VARIANCE OF THE MAXIMUM LIKELIHOOD ESTIMATOR 


In general the variance* of a Likelihood Estimator may be written as


(18) $\sigma_{\hat{\theta}}^{2} = -\left[E\left(\frac{\partial^{2} \ln L(x)}{\partial \theta^{2}}\right)\right]^{-1}$ (Cramér, 1951, p. 500)


where L(x) is the Likelihood function. Then


(19) $\sigma_{\hat{n}}^{2} = \left\{\left[E\sum_{j=n-k+1}^{n}\frac{1}{j^{2}}\right] - \left[\frac{N}{n^{2}}\right]\right\}^{-1}$


Now

(20) $\sum_{j=n-k+1}^{n}\frac{1}{j^{2}} \cong \frac{k}{n(n - k + 1)}$


for even moderately large k and n. Using result (11) and (15), the left
hand member in brackets of (19) becomes


(21) $\sum_{m=0}^{n-1}\frac{n - m}{n(m + 1)}\cdot\frac{e^{-\lambda}\lambda^{m}}{m!}\cdot\frac{1}{1 - P(m = n)}$

Letting m + 1 = x, this takes the form

It should be pointed out here that although this variance will give a
standard error of the MLE, the relation of this standard error to actual 
confidence limits for n is not simple. The question of confidence limits 
will be explored in a later section. 


APPLICATION TO THE NUMBER OF LOCI ON A CHROMOSOME 


An important parameter in population genetics is the number of 
loci on a chromosome. It has, in addition, bearing on the basic problem 
of the structure of the chromosome and its relation to the structure of 
the genes. Many methods are available for estimating this quantity 
depending upon the criteria of “locus” used (see Herskowitz, 1950). 
The particular estimate most used in population genetics is that of the 
number of loci capable of lethal mutation, the assumption being that 
the mutation rate and selection coefficient are identical for all loci and 
that therefore in an effectively infinite population all loci will be equally 
represented by lethal mutations. 

The method used to determine the number of loci capable of lethal 
mutation is as follows. A number of organisms are sampled from a 
population. By appropriate genetic tests the details of which may be 
found in Wallace (1950) it is determined how many of these carry on 
one of their chromosomes a lethal gene. Since organisms in general 
possess a duplicate set of chromosomes, an individual carrying one 
chromosome with a lethal mutation will survive, the homologous normal 
chromosome serving to cover the lethal. Separate stocks are now made 
each of which derives from one of the original lethal bearing organisms. 
Each stock is then mated with every other stock so that N stocks will 
give rise to N(N — 1)/2 crosses. If two stocks both contain a lethal 
at the same position on the chromosome, that is at the same locus, then
when these stocks are mated some of the offspring will have both of their 
chromosomes carrying the lethal and thus will die. On the other hand 
if the lethal genes contained in the two stocks are at different loci, none 
of the offspring will contain a lethal gene in double dose so that all the 
offspring survive. If two lethal genes are at the same locus they are 
said to be allelic. 

The procedure described above provides three items of information. 
First it serves to estimate the proportion of all chromosomes in the 
population which contain lethals. Second, among the N lethal bearing 
chromosomes it shows how many loci are represented once, twice, 
thrice, and so on. Finally, as a corollary of the second, it gives k, the 
number of different loci (classes) present among the N lethals tested.


THE FREQUENCY OF ALLELISM ESTIMATE


The unique feature of the genetic estimation problem just described 
lies in the inability to identify two lethals as being identical, or at the 
same locus, without the process of mating. In a probabilistic sense, the 
mating procedure is simply a method for choosing pairs of lethals. If 
two members of the pair are identical the pair will be lethal. If the two 
members are not identical, the pair will be viable. Using this point of 
view Wright and Dobzhansky (1941) have suggested an estimate of n 
in the following way. If $p_i$ is the probability that the ith locus is repre-
sented in the sample by a lethal, then $p_i^2$ is the probability that a given
pair of lethal chromosomes will both be lethal at the ith locus. Since
there are n loci all assumed to be equally frequent in the universe from
which the sample is taken, the total probability that a given pair will
consist of two identical members is


(25) $a = np_i^{2} = n\left(\frac{1}{n}\right)^{2} = \frac{1}{n}$
Then

(26) $\tilde{n} = \frac{1}{\hat{a}}$


is an estimate of n, where â is the observed frequency of allelism, that is
the observed proportion of all pairs sampled which contained two
identical members.

There are several points to be noted in connection with this estimator.
Like the MLE it may give an infinite estimate of n when
N < n. In this case â may be zero, which exactly corresponds to the
probability that k = N in the MLE.


No matter what the sample size, ñ is upwardly biased since it is
clear that

$E\left(\frac{1}{\hat{a}}\right) \geq \frac{1}{E(\hat{a})}$

one-to-one correspondence between â and k. Thus all of the available
information is not contained in â.

There are two experimental designs to which the frequency of allelism 
estimate may be applied. The first of these has been described in the 
previous section and will be denoted as the complete cross design. The 
second method will be referred to as the random cross design. It consists 
in dividing m lethals into two sub-groups of size m/2 and making m/2 
crosses or pairs, one member of each pair coming from each of the two 
groups. From a probabilistic standpoint this simply consists in choosing 
m/2 random pairs of lethals. 

While the frequency of allelism estimate is applicable to both of these 
designs, the estimate will have a different variance in each case. For 
the random cross method, the variance of the estimate may be derived 
as follows. If an estimate, t, is asymptotically normally distributed,
then in large samples the variance of any function of t is given by


(30) $\sigma^{2}_{f(t)} = \left(\frac{df}{dt}\right)^{2}\sigma^{2}_{t}$ (Cramér p. 354)
For the specific case in question

(31) $\sigma^{2}_{\tilde{n}} = \sigma^{2}\left(\frac{1}{\hat{a}}\right) = \frac{\sigma^{2}_{\hat{a}}}{a^{4}}$


But â is the proportion of successes in m/2 independent Bernoulli trials
since it is the proportion of random pairs which show allelism. Then


(32) $\sigma^{2}_{\hat{a}} = \frac{2a(1 - a)}{m}$

Finally from (31) and (32)

(33) $\sigma^{2}_{\tilde{n}} = \frac{2n^{2}(n - 1)}{m}$


which for moderately large values of n is approximately

(34) $\sigma^{2}_{\tilde{n}} \cong \frac{2n^{3}}{m}$


When the complete crossing scheme is used, however, the variance 
cannot be arrived at in such a simple manner. This is because the 
N(N — 1)/2 crosses are not independent. The outcome of these crosses 
will be a unique function of the distribution of lethals in the original 
sample of size N. There will be a positive covariance among the crosses 
the value of which will depend upon the distribution of lethals in the
sample. We have not been able to derive a satisfactory expression for
the variance of ñ for this model. However, for an equal number of
pairs the random cross scheme will have a lower variance than the
complete cross design. By equal number of pairs we mean that m in
the random cross is equal to N(N − 1) in the complete design. Ex-
pression (33), then, represents the lower limit of the variance of ñ.
Setting m = N(N − 1), it is clear that the frequency of allelism
estimate is inefficient as compared with the MLE since the efficiency
of ñ is defined as


(35) $\text{Eff}(\tilde{n}) = \frac{\sigma^{2}_{\hat{n}}}{\sigma^{2}_{\tilde{n}}} \cong \frac{(N/n)^{2}}{2\left(e^{N/n} - 1 - N/n\right)}$


which is always less than unity as can be seen from the expansion of
the exponential.

Again it should be pointed out that (35) is the efficiency of Wright’s 
estimate under the random crossing scheme as compared with the MLE 
under the complete crossing scheme when equal numbers of crosses are 
made for both. Should Wright’s estimate be applied to the complete 
cross design, it would presumably be even less efficient. 
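
The dependence of the efficiency on N/n can be tabulated with a few lines of Python. This is a sketch under a stated assumption: it takes the efficiency to be x²/[2(e^x − 1 − x)] with x = N/n, which is what the ratio of the two large-sample variances gives when m = N(N − 1) is replaced by N²; the function name is illustrative.

```python
import math


def efficiency(x):
    """Assumed efficiency of the allelism estimate relative to the MLE, x = N/n."""
    # expm1(x) - x equals e^x - 1 - x and stays accurate for small x
    return x * x / (2.0 * (math.expm1(x) - x))


if __name__ == "__main__":
    for x in (0.1, 0.25, 0.5, 1.0, 2.0, 4.0):
        print(x, round(efficiency(x), 4))
```

Since e^x − 1 − x = x²/2 + x³/6 + ..., the ratio is close to 1/(1 + x/3) for small x and falls steadily as the sample grows relative to n.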

Fig. 1 shows the efficiency of ñ for various values of N/n.

Before the estimates can be compared for the number of loci on a 
chromosome, a correction must be made for the presence of more than 
one lethal mutation on a chromosome. It is not possible to distinguish 
a chromosome with more than one lethal from those with only one. 
An additional datum required for this correction is the frequency of 
lethal bearing chromosomes in the population from which the sample of 
lethals was derived. On the assumption that the number of lethal 
genes in a lethal chromosome has a Poisson distribution, Wright showed 
that his estimate takes the form 


(36) $\tilde{n} = \frac{1}{\hat{a}} = \frac{[\ln(1 - Q)]^{2}}{-\ln(1 - AQ^{2})}$


where â = frequency of allelism of lethal genes
A = frequency of allelism of lethal chromosomes
Q = frequency of lethal chromosomes in the population
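
The correction can be evaluated directly. The Python sketch below assumes that the corrected estimate has the form [ln(1 − Q)]² / [−ln(1 − AQ²)]; this form is an assumption adopted here because it returns the values of Wright's estimate listed in Table 1 (289 and 406) when the tabulated Q and A are inserted. The function name is illustrative.

```python
import math


def wright_corrected_estimate(A, Q):
    """Assumed form of the corrected estimate: [ln(1 - Q)]^2 / (-ln(1 - A * Q^2))."""
    gene_allelism = -math.log1p(-A * Q * Q) / math.log1p(-Q) ** 2
    return 1.0 / gene_allelism


if __name__ == "__main__":
    print(wright_corrected_estimate(0.00407, 0.1529))  # D. pseudoobscura, 3rd chromosome
    print(wright_corrected_estimate(0.0028, 0.1209))   # D. melanogaster, 2nd chromosome
```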


This same correction factor may be applied to the Maximum Likelihood 
estimate where N and k in eq. (7) or (9) are calculated from: 


(37) sensing es a
aes — .

Bernoulli trials, upper and lower confidence limits for a are found by
use of tables of the binomial sums, or else binomial probability paper.
The upper and lower confidence limits for n are then simply the recipro- 
cals of the values for a. When the complete cross design is used, how- 
ever, this simple method does not apply. Unfortunately several authors, 
(Pavan & Knapp, 1954; Wallace, 1950) have given confidence limits 
based upon the random cross model, while actually employing the 
complete cross design. Such confidence intervals are clearly too small. 
One of the present authors (R.C.L.) is responsible for this error in the 
paper of Pavan & Knapp. What the correct limits are can only be 
stated when the distribution function of â under the complete cross
design is demonstrated. 

Exact confidence intervals for n using the MLE can be found in 
theory, although greater or less practical difficulties may ensue depending 
upon the size of the sample. Expression (8) is the exact distribution of 
k given the sample size and n. The upper confidence limit for n is then 
the value of this parameter which when substituted into expression 
(8) makes 


Pr {k ≤ observed k} = α/2


where 1 − α is the confidence coefficient, let us say .95.

Unfortunately for N and n of even moderately large size, the sum- 
mation of (8) becomes prohibitive both in terms of time and accuracy 
unless special computing machines are available. 

It is possible, then, to use the approximate expression (9). This is 
satisfactory for the lower confidence limit, but unless N is quite large 
with respect to n, the upper confidence limit cannot be estimated by 
the use of this expression. This is because, as we have pointed out, the 
approximation only holds if λ is bounded. Now for a fixed value of N,
λ increases without bound as n grows very large. The result of this
increase is that the upper confidence limit for n will be grossly over-
estimated and may often be infinite. If exact confidence limits are
desired, recourse must be had to the exact distribution (8).
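
With a computer the exact calculation is no longer prohibitive. The Python sketch below does not sum the alternating series (8) directly; it builds the same distribution of k one draw at a time, a substituted technique that avoids cancellation error and is not the procedure of the text. The function names and the use of a tail probability of α/2 for the upper limit are assumptions.

```python
import math


def distinct_class_pmf(n, N):
    """Return [P(k = 0), ..., P(k = min(n, N))] for N draws from n equally likely classes."""
    size = min(n, N) + 1
    probs = [1.0] + [0.0] * (size - 1)
    for _ in range(N):
        new = [0.0] * size
        for j, p in enumerate(probs):
            if p == 0.0:
                continue
            new[j] += p * j / n                # the draw repeats a class already seen
            if j + 1 < size:
                new[j + 1] += p * (n - j) / n  # the draw reveals a new class
        probs = new
    return probs


def upper_limit(N, k_obs, alpha=0.05):
    """Largest integer n with Pr{k <= k_obs | n, N} > alpha/2; infinite when k_obs == N."""
    if k_obs >= N:
        return math.inf

    def tail(n):
        return sum(distinct_class_pmf(n, N)[: k_obs + 1])

    lo, hi = k_obs, 2 * k_obs + 1
    while tail(hi) > alpha / 2:  # expand until the tail probability drops below alpha/2
        lo, hi = hi, 2 * hi
    while hi - lo > 1:           # integer bisection between lo and hi
        mid = (lo + hi) // 2
        if tail(mid) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    print(upper_limit(20, 10))
```

When k equals N the function returns an infinite limit, in agreement with the remarks above on the case in which all sampled classes are different.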

For a large sample, of course, the MLE is approximately normally
distributed with a variance given by (24), so that an approximate 95%
confidence interval will encompass 1.96 standard deviations on either
side of the estimate n̂.


A SPECIFIC EXAMPLE 


With the above reservations in mind we may contrast the frequency
of allelism estimate with the MLE for two sets of data. The pertinent
information is shown in Table 1. The data on the third chromosome
of Drosophila pseudoobscura are taken from Wright & Dobzhansky (1941)
while those on the second chromosome of D. melanogaster are from
Wallace (1950). In both cases the frequency of allelism estimate is larger
than the Maximum Likelihood estimate as might be expected from the
bias of the former. However, since nothing has been demonstrated about
the bias of the MLE, too much weight should not be given this point.
Likelihood estimates are not, in general, unbiased. The standard errors
shown for Wright’s estimate are minimum values on the assumption of
a random cross model. Actually the complete cross design was used in
both so that the true standard errors are somewhat higher.


TABLE 1
Comparison of Maximum Likelihood Estimate and Frequency of Allelism Estimate
for Number of Loci on a Chromosome. For Symbols, see Text.

                      N     k     Q       A        ñ (Wright)   n̂ (MLE)   S.E. MLE   S.E. Wright
3rd chromosome,
D. pseudoobscura      105   86    .1529   .00407   289          268        55.2       66.4
2nd chromosome,
D. melanogaster       100   87    .1209   .0028    406          342        85.2       116.1
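
The standard errors in Table 1 can be checked approximately. The following Python sketch rests on two assumptions that should be read as such: that the large-sample variance of the MLE has the form n/(e^{N/n} − 1 − N/n), obtained by putting E(k) from (16) into the information expression, and that Wright's minimum variance is 2n²(n − 1)/m with m = N(N − 1). The function names are illustrative, and the tabulated N and estimates are used as given.

```python
import math


def se_mle(N, n_hat):
    """Assumed large-sample S.E. of the MLE: sqrt(n / (e^(N/n) - 1 - N/n))."""
    x = N / n_hat
    return math.sqrt(n_hat / (math.expm1(x) - x))


def se_wright(N, n_tilde):
    """Minimum S.E. of the allelism estimate, random cross model with m = N(N - 1)."""
    m = N * (N - 1)
    return math.sqrt(2.0 * n_tilde ** 2 * (n_tilde - 1.0) / m)


if __name__ == "__main__":
    print(se_mle(105, 268), se_wright(105, 289))  # D. pseudoobscura
    print(se_mle(100, 342), se_wright(100, 406))  # D. melanogaster
```

Under these assumptions the computed values fall within about 0.1 of the tabulated standard errors; small differences are to be expected because the tabulated estimates are rounded.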


THE CHOICE OF AN ESTIMATE 


Each of the estimation procedures described has certain disadvan- 
tages. If a random cross procedure is used, the MLE is not applicable 
because the requisite information, k, cannot be obtained from the data. 
The random cross procedure has the advantage that exact confidence 
limits can be placed upon the estimate of n. If the complete cross 
design is used, the MLE is superior since it is more efficient, as well 
as being sufficient. Moreover the frequency of allelism estimate is
demonstrably biased and insufficient, while it shares with the MLE the
difficulty of establishing exact confidence limits. The choice then lies
between the random cross design using Wright’s estimate and the
complete cross design using the MLE. When the sample size is small
compared with n, the efficiencies of the two methods do not differ greatly
as shown in Fig. 1. This does not take into account, however, that for
an equal number of crosses to be made in the two methods, the original
task of sampling the parent population is greater under the random
cross design. That is, N(N − 1) lethal chromosomes are required to
make N(N − 1)/2 random crosses, while only N lethal chromosomes
are necessary for this many crosses in a complete design. The labor is
then approximately N times as great in the first stage of the operation.

Pavan & Knapp found 1,063 as the estimated n in the second
chromosome of D. willistoni using a modified complete cross design.
These authors contrast this with Wallace’s estimate of 406 for the second 
chromosome of D. melanogaster (see Table 1). As they point out, 
these two chromosomes have been found by other means to be homol- 
ogous, that is they contain more or less the same loci. However, 
since the confidence limits for these two estimates are much larger 
than these authors show, the results of Pavan & Knapp and of Wallace 
are not really in conflict. 

What these data demonstrate is that large differences in the esti- 
mated number of loci may have no significance whatever because of the 
general inefficiency of estimation procedures. 

In general, neither of the estimates is very efficient when sample size 
is small as compared with n. Moreover, unless N > n, both methods
run the risk of yielding an infinite value for either estimate of n. This objection
applies to the random cross design, no matter what the size of N, 
although decreasingly so for large N. If estimates of the number of 
loci are to be made then, it would be best to perform a large scale experi- 
ment with a sample size in excess of the current approximate values of n. 
Such an experiment, using the MLE would provide a reasonably secure 
estimate of n. The estimates thus far obtained provide only orders of 
magnitude rather than precision. 
QUERIES


GEORGE W. SNEDECOR, Editor


QUERY 121: In Query No. 102, June 1953, you gave a method for
supplying the yield of a missing plot in an unreplicated factorial
experiment. You indicated that the test of treatments is biased
upwards but that the bias is assumed to be small. I have a similar
experiment in which one of the treatment means is significant at about
the 3% level. I would like to know if the bias is sufficient to change the
conclusion. How can I determine the amount of bias?


ANSWER: The amount of bias can be found and subtracted from the
approximate sum of squares for treatments. The amount of bias is
determined in the following way.

Insert the symbol x for the missing value and obtain the expressions
for the sums of squares corresponding to the individual degrees of
freedom, supposing that x is a number. Each sum of squares is then
either a number independent of x or a quadratic in x with known
coefficients. Denote the error sum of squares by E(x), so that

$$E(x) = ax^2 + 2bx + c,$$

and the sum of squares corresponding to the effect of interest by T(x),
with

$$T(x) = dx^2 + 2ex + f.$$

Then the minimum of E(x) occurs when $x = \hat{x} = -b/a$ and is

$$E_{\min} = c - \frac{b^2}{a}.$$

This is the value of the error sum of squares used in both
the exact and approximate tests of significance. The value of the effect
sum of squares used in the approximate test of significance is

$$T(\hat{x}) = f - \frac{2be}{a} + \frac{db^2}{a^2}.$$

On the other hand the minimum of E(x) + T(x) occurs for

$$x = -\frac{b + e}{a + d}$$

and is

$$(E + T)_{\min} = c + f - \frac{(b + e)^2}{a + d}.$$

The effect sum of squares for the exact test of significance is then
given by general linear hypothesis theory as

$$(E + T)_{\min} - E_{\min} = f - \frac{(b + e)^2}{a + d} + \frac{b^2}{a}.$$

Hence the bias in the approximate effect sum of squares is

$$\text{Bias} = T(\hat{x}) - \left[(E + T)_{\min} - E_{\min}\right]
= \frac{db^2}{a^2} - \frac{2be}{a} - \frac{b^2}{a} + \frac{(b + e)^2}{a + d}.$$
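
The bias calculation can be sketched numerically. The Python below is a minimal illustration, not the editor's own procedure: the function names are hypothetical, and it assumes the quadratics are written as E(x) = ax² + 2bx + c and T(x) = dx² + 2ex + f, the convention implied by the 2be term in the approximate sum of squares. Putting the four terms of the bias over a common denominator gives (ae − bd)²/[a²(a + d)], which the last function uses as a cross-check.

```python
def approx_effect_ss(a, b, d, e, f):
    """T(x) evaluated at the x that minimises E(x) = a*x**2 + 2*b*x + c."""
    x_hat = -b / a
    return d * x_hat**2 + 2 * e * x_hat + f


def exact_effect_ss(a, b, d, e, f):
    """(E + T)_min - E_min; the constant c cancels in the difference."""
    return f - (b + e) ** 2 / (a + d) + b**2 / a


def bias(a, b, d, e, f=0.0):
    """Approximate minus exact effect sum of squares."""
    return approx_effect_ss(a, b, d, e, f) - exact_effect_ss(a, b, d, e, f)


def bias_closed_form(a, b, d, e):
    """Algebraically equivalent form; non-negative whenever a > 0 and d >= 0."""
    return (a * e - b * d) ** 2 / (a**2 * (a + d))
```

With purely illustrative coefficients a = 0.5, b = −2, d = 0.25, e = −1.5, f = 10, the approximate sum of squares is 2, the exact one is about 1.667, and the bias is 1/3. Because the closed form is a square over a positive quantity, the approximate test can only overstate the effect, in line with the upward bias mentioned in the query.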
As an example consider the evaluation of the bias in the SK effect
in Query 102. The sum of squares for the effect is a known multiple of

$$x^2 - 56.60x + 800.89$$

(the multiplying constant is not legible in this copy), and the error
sum of squares is

$$E(x) = 0.50x^2 - 23.62x + 381.85.$$

The remaining steps of the worked example are not legible in this copy.


ABSTRACTS 


Papers presented at the Third Colloquium of the German Region of the Biometric Society 
at Bad Nauheim, January 27-29, 1956 


366 R. K. BAUER. (Munchen). General Theory of an Anthropometric
Test of Paternity.


A test of paternity is carried out in two steps: first the survey, then 
the evaluation of evidence on the observational unit. The observational 
unit is 3-valued usually and consists of child, its certain mother and 
uncertain father. The evidence consists of combinations of, say, M
hereditary characters, which can be found on the observational unit.
As a result of the survey we get a matrix X of 3 × M variates. The
best criterion of evaluation is the ratio of the probability of X
representing a “true” family to the probability of X representing a “false”
family. All anthropometric test methods for proof of paternity are 
based on this criterion. Practical statements, however, lead perforce 
to LUDWIG’s proposal, i.e. to abbreviated discriminant functions. 
The results of a recently finished model experiment are shown. 


367 W. U. BEHRENS. (Hanover). Problems in Correlation Statistics.


Frequently too much confidence is placed in correlation analysis. 
Significant correlation is no proof of a direct causal relationship. Re- 
gression lines are not suitable for representing structural relationships. 
Particular difficulties arise with the interpretation of partial correlation 
coefficients. The usual treatment in textbooks leads to the mistaken 
conclusion that partial correlation is a key for detecting causal rela- 
tionships between single variates. The danger in such interpretation 
is outlined and demonstrated by means of models and agricultural 
experiments. There is no argument against the use of regression for 
fitting empirical data. 


368 W.U. BEHRENS. (Hanover). The Fitness of Different Field- 
Trial Designs for the Removal of Soil Differences. 


For the layout of field trials with 4 to 10 treatments Latin square 
and block designs are frequently used. Under the assumption that 
small differences between treatments allow for efficient screening of 
soil differences, various field trial designs are discussed and particularly 
suitable (‘gerechte’) designs demonstrated. Highest precision of these 
designs is obtained with square or near square plots in contrast to long 
rectangular plots, though generally the latter allow a better screening 
of soil differences. 
369 H. GEIDEL. (Rethmar). On Symbols and Terminology for
Use in Agricultural Biometry. (Report.) 


370 A. JANOSCHEK. (Giessen). Quantum Biology and Reaction
Kinetics.


The time-trend of the decline of a homogeneous population of 
organisms under the effect of a noxious principle agrees with a reaction 
which can be formularized by the GLOCKNER proposition. The 
dose effect relation on the other hand is more informative generally. 
The effect of time (t) and dose (D), combined in a single formula, yields


N = M{1 — exp [—(k-D)?-#]}” 


This formulation of the law of mass action applies also to the rate of 
population decline and growth. Examples show general validity of 
this basic law of reaction kinetics. 


371 G. A. LIENERT. (Marburg). Quantitative Analysis of Test 
Methods in Clinical Medicine. 


As yet no statistics have been defined to measure the reliability 
of clinical testing methods, other than the usual description by a 2 × 2
table of percentages (negative/positive reaction × healthy/ill persons).
The use of a suitable correlation coefficient as a means of characterizing 
reliability and symptomatic value is suggested; a possible conventional 
interpretation of the coefficient is outlined. 


372 O. LUDWIG. (Bad Nauheim). Theory and Application of
Runs.


Several kinds of runs are defined, and their importance for tests 
of random order in sampling for attributes is explained by means 
of examples from genetics, plant breeding, and meteorology. Several 
conditional and unconditional tests are considered for two or more 
characters. The theory may be applied to observations on variables 
too, e.g. by considering runs above and below the median. Runs up 
and down and runs of consecutive elements are also defined for variables
(measurements); but their distribution under the null hypothesis is
entirely different from that of the above-mentioned runs. In a certain
sense an inversion of common theory is the “Pascal problem” (inverse
sampling) for runs; the connection of this problem with that of direct
sampling and with FELLER’s theory of recurrent events is dealt with.
373 K. H. MULLER. (Jena). On the Accounting for Soil Variability
in Field Trial Analysis of Variance.


In field trials the designed layout, i.e. the number of blocks, may or 
may not correspond with the number of actual different soil qualities. 
In the first case the block degrees of freedom have to be used in the 
statistical analysis; in the latter case it seems appropriate to allocate 
the experimental material and to assign the degrees of freedom cor- 
respondingly to realized soil groups. In that case, this seems to be a 
logical procedure to ensure a separation of systematic and random 
fluctuations. 


374 H. RUNDFELDT. (Hanover). Review of Methods Usually
Applied in Field Plot Technique.


In field plot experiments the existence of soil variation by which 
means and variances might be considerably biased must be always 
taken into consideration. It is therefore necessary to reduce its influ- 
ence by choice of a suitable layout and statistical analysis of the experi- 
ment. In respect of the layout long narrow plots are preferable. Re- 
garding the analysis various systems were compared by means of dummy 
experiments. Based on the theoretical results of the evaluation of an 
experiment without soil differences (s² = 100) and the information
resulting from a given field one finds the following “rule-of-thumb
numbers” for the tested systems: A. for few treatments, 1. without soil
balance (completely randomized designs) 250, 2. randomized blocks
125, 3. method MITSCHERLICH 125, 4. Latin squares 120, 5. method
LINDHARD 120, 6. comparison by means of systematic controls 200.
B. for a greater number of treatments, 1. systematic controls 200,
2. random controls 200, 3. lattices 130-150. Besides these “rule-of-thumb
numbers” one will notice various technical advantages and
disadvantages in applying the different systems. 


375 B. SCHNEIDER. (Giessen). On the Usefulness of Percentiles 
in Biometry. 


The estimation of standard deviation from large samples by approximation
formulas frequently introduces uncontrollable and principal
errors into further considerations. As a measure of dispersion the
percentiles or a function of the same (e.g. semi-interquartile range) are
to be preferred; the usefulness of this standard deviation estimator is
demonstrated.
376 W. SCHNELL. (Scharnhorst). On the Sphere of Permissible
Generalization of Field Trial Results.


Field trials are considered in which several “yield” factors such as 
“treatments” and “blocks” have been modified, the modifications 
(or “levels”) of each factor being orthogonal to the modifications of the
other factors. (EISENHART’s model I and II, mixed model.) The 
sphere of permissible generalization of the result of a comparison is 
said to comprise, with respect to some orthogonal factor modified at 
random, the corresponding population of modifications of that factor; 
however, as regards some orthogonal factor with fixed modifications, 
the generalization sphere is confined to the modifications actually used. 
As a special case, if the factor in question is acting additively within 
the limits so described, the generalization sphere widens as far as 
additivity holds. As a rule, there are sizable interactions of “treatments”
with “soils” and “years”, respectively. One-year trials, though giving
an unbiased estimate of the long-time average of some interesting
comparison, do not furnish an estimate of the appertaining interaction
with years and hence will not yield a valid error, if recommendations
for future years are to be made. The resulting need for trials replicated
over several years suggests the applicability of sequential sampling 
procedures to field experimentation. 


377 K. v. SOLTH. (Marburg). On Studies of the Gestation Process
Under Different Gynaecological Diseases. 


The question of whether a relation exists between the ratio of 
abortions to births and different genital diseases is studied in cases of 
myoma (935), collum (450) and corpus (143) cancer. A method for 
characterizing this existing relation is given. 


378 R. WARTMANN. (Dusseldorf). Analysis of Variance When
the Population is Finite.

Multistage sampling of a finite population with p', q', r', ..., units
at each stage, from which p < p', q < q', r < r', ..., are sampled.
The precision of a single one of the p·q·r··· samples is asked for.
Formulas for the variance of a single sample, and for the separation
of its components, are given.

379 M. WERMKE. (Bochum). Combined Analysis of Variance of
Field Trials by Computing Machine Equipment.

The technique of computational procedure with extensive field
trial material by use of IBM punched card equipment is detailed
for a two-factorial analysis of variance. Generalizations.
380 R. WETTE. (Heidelberg). On the Use of Regression Lines in
Biology.


Frequently the conditions for identifiability of a linear or allometric 
structural relationship between two variables subject to error are not 
met in biological material. Application of the usual regression procedures 
to material of this kind yields fallacious results when the causal relation- 
ship between the structural variables is asked for, a fact frequently 
neglected. 


Papers presented at the first meeting of the Brazilian Region of the Biometric Society, 
at Instituto Biologico, Sao Paulo, Brazil, January 3, 1956 


381 A. CONAGIN. New Tests for Comparison of Means. 


In this paper the author discusses new tests for comparison of two 
means or contrasts “a posteriori” when the null hypothesis of a group
of n means is rejected at an α level.

The tests considered are the least significant difference test and also 
tests of Newman, Keuls, Tukey, Scheffe and Duncan. The essential 
differences between them are pointed out in detail and the rules applied 
to a particular situation. 


382 A. GROSZMANN AND J. DOBEREINER. Problems in the 
Statistical Analysis of Population Growth in Bacteria. 


Two strains of bacteria of the genus Beijerinckia were studied for 
different purposes. Strain “E” was very efficient in free nitrogen
fixation, fixing 15 mg of N; strain “F” was low, fixing only 8.5 mg of
N per gram of sucrose. The experiment on counting the number of
bacteria in the population was run in a split-plot design, with two
replications, “E”, “F” and “0” (check) strains, countings made at
1, 2 and 3 week intervals, in two different culture media. The data
were analysed on number of bacteria found under the field of the
microscope. Ten fields were counted from every treatment. The
analysis showed a C. V. = 38.9% for countings and a C. V. = 10.3%
for “plots”. In spite of the relatively large error term, highly significant
differences were found between strains and among weeks. The difficulties
in sampling the population and making bacteria counts under the
microscope were discussed.


383 W. R. JARDIM, A. M. PEIXOTO, S. SILVEIRA FILHO AND
F. PIMENTEL GOMES. Study on the Precision and Accuracy
of a Few Practical Methods of Estimating Milk Production of
Dairy Cows.


This paper deals with the estimation of milk production by means 
of biweekly, monthly and bimonthly observations, without taking 
into account, as is usual practice, the date of calving. The data studied 
were 72 records of cows of the Holstein-Friesian breed: 6 calvings 
in each month of the year and also 12 first calvings, 12 second calvings 
and so on, up to the sixth. These cows belong to the herd of the Escola 
Superior de Agricultura “Luiz de Queiroz” (Piracicaba, S. P., Brazil),
which has been kept since 1914 under approximately the same condi- 
tions. 

The authors criticize the use of ‘‘maximum error’’ and also the use 
of mean deviation, both to be found in papers dealing with this subject. 
The former is completely superseded and inadvisable, and the latter, 
although equivalent to a certain extent to the usual standard deviation, 
has only 87.6% of its efficiency. 

The data obtained were compared with the actual production,
corresponding to daily control, and the deviations observed were 
studied. Their means and standard deviations were the following, in 
kilograms per lactation period. 


Method               Mean of Deviations    Standard Error of Mean
Biweekly control          +  7.59                   7.9
Monthly control           +  8.92               (illegible)
Bimonthly control         +121.86                  21.7


Comparison of these means with the expected value (zero) under 
the null hypothesis, by the t test, shows that biweekly and monthly
controls may be taken as unbiased, while bimonthly control is biased,
the bias being positive and around 5% of the actual production. 
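
The comparison of each mean deviation with zero can be reproduced from the tabulated figures. The Python sketch below is illustrative: the names are hypothetical, the monthly row is omitted because its standard error cannot be read in this copy, and the critical value 2.0 is an approximate two-sided 5% point of Student's t for about 71 degrees of freedom (72 records less one), which is an assumption rather than a figure given by the authors.

```python
# Mean deviation and standard error of the mean, in kg per lactation period,
# copied from the table above. The monthly row is left out because its
# standard error is not legible in the source.
DEVIATIONS = {
    "biweekly": (7.59, 7.9),
    "bimonthly": (121.86, 21.7),
}

# Approximate two-sided 5% point of Student's t; the degrees of freedom
# (about 71) are an assumption, not stated in the abstract.
T_CRITICAL = 2.0


def t_ratio(mean_deviation, standard_error):
    """Ratio of a mean deviation to its standard error (null value zero)."""
    return mean_deviation / standard_error


def looks_biased(mean_deviation, standard_error, critical=T_CRITICAL):
    """True when the mean deviation differs significantly from zero."""
    return abs(t_ratio(mean_deviation, standard_error)) > critical


for method, (mean, se) in DEVIATIONS.items():
    print(f"{method}: t = {t_ratio(mean, se):.2f}, biased = {looks_biased(mean, se)}")
```

The biweekly ratio falls well below the critical value and the bimonthly ratio well above it, in agreement with the conclusion stated in the abstract.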

An analysis of variance of the observed deviations was carried out, 
this being correct in view of recent research by G. E. P. Box (Ann. 
Math. Stat. 25: 290-302, 1954). The analysis, completed by Tukey’s 
studentized range test, shows that, with respect to a possible bias, the 
biweekly and monthly controls may be accepted to be equal to each 
other, but that they are both significantly different from the bimonthly 
control. 
384 RUY AGUIAR DA SILVA LEME. (Escola Politecnica da
Universidade de Sao Paulo). The “Simplex” Method in Multiple
Regression.


A modification in the method of Pivotal Condensation (Rao, Ad- 
vanced Statistical Methods, 1952) allows a prediction to be made of the 
improvement afforded by a new independent variate on the sum of 
squares due to regression. The method of Pivotal Condensation thus 
modified becomes formally equivalent to the “Simplex” Method of
Linear Programming. 

An actual example is entirely worked out for illustration. 


385 A. M. PENHA, G. SCHREIBER, A. R. HOGE AND H. E.
BELLUOMINI. Application of the Discriminant Function to a
Problem of Sex Differentiation in Snakes.


In a species of snake (Bothrops insularis) inhabiting exclusively a 
small island off the coast of the State of Sao Paulo, a large proportion 
of individuals, tentatively considered as inter-sexual, externally resemble 
males (male copulatory organs present) but after dissection show fertile 
female internal organs. 

With the values of 3 external morphological characters—head, 
body and tail lengths—Fisher’s discriminant function was calculated 
for a group of 155 typical males and 65 typical females of the afore- 
mentioned species. As indicated by the analysis of variance, the 
function had a highly significant discriminant power (F = 78.7). Its 
probability of misclassifying a randomly taken individual was 0.13, 
according to the corresponding ‘‘’’ value, whereas the observed variates 
were much less discriminating, showing the following probabilities of 
misclassification under the same conditions: head length, 0.35; body 
length, 0.44; tail length, 0.38. Furthermore, the isolated contribution of 
each measurement to the discriminant was highly significant. 

When the calculated function was applied to a group of 151 inter- 
sexual individuals, 93% were classified as females, which indicated a 
proportion of misclassification well below the theoretical expectation 
for typical females. 
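
The link between the separation of two groups and the probability of misclassification can be illustrated with a short calculation. The Python sketch below is not taken from the abstract, which does not state how its probabilities were obtained. It assumes two normal populations with a common covariance matrix, equal prior probabilities and a cut-off midway between the group means; under those assumptions the error probability is Φ(−D/2), where D is the standardised distance between the groups. The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def misclassification_probability(distance):
    """Error rate of a midpoint rule for two equal-covariance normal groups
    separated by a standardised distance `distance`."""
    return _STD_NORMAL.cdf(-distance / 2)


def distance_for_probability(p):
    """Standardised distance that would give error probability p (0 < p < 0.5)."""
    return -2 * _STD_NORMAL.inv_cdf(p)
```

Under these assumptions an error rate of 0.13 corresponds to a distance of about 2.25 standard deviations, while the single-character rates of 0.35 to 0.44 correspond to distances well below one standard deviation, which shows how much the combined function gains over any one measurement.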


386 M. ROCHA E SILVA AND A. M. ROTHSCHILD. (Instituto
Biologico, Sao Paulo, Brazil). A 4-point Design for Bioassay of a
Material Inducing Strong Tachyphylaxis (Anaphylatoxin), With
a Reference to the Mechanism of Desensitization.


It has been shown in this laboratory that anaphylatoxin prepared by 
incubating normal rat plasma with agar owes its stimulating action
upon the guinea pig ileum to a release of histamine. The contracting 
effect is, therefore, indirect, and the response becomes less and less as 
the additions of the same dose of anaphylatoxin are repeated (tachyphy- 
laxis or desensitization). The possibility of comparing two different 
preparations of anaphylatoxin, a “standard” and the “unknown” by 
applying them alternatively to the same piece of intestine would give 
highly inaccurate results on account of the enormous bias introduced 
by the previous additions of the agent. By using the known set-up of 
a 4-point assay with randomized blocks as utilized for histamine, 
bradykinin, oxytocin and so forth, the bias introduced by the tachyphy- 
lactic effect is so strong that only exceptionally could any reasonable 
ratio of potency be derived from the experimental data. 

After trying several designs in order to test the possibility of setting 
a 4-point assay for a quantitative estimation of anaphylatoxin, we have 
found that a 4 × 4 Latin square design is able to correct for the bias
introduced by the tachyphylactic effect, allowing a reasonable estimate
of the ratio of potency between “unknown” and “standard”. The
variation due to the tachyphylactic effect could be confounded with the
so-called “order of additions” (“Rows”) and its sum of squares eliminated
from the experimental error, thus reducing to about 10 to 12% the
standard error of each response. In a series of 19 squares the variances
were found homogeneous, and the value of λ = s/b around 0.10.
Deviations from parallelism and linearity were trivial and
non-significant in nearly all experiments. The standard error of M (= s_M)
was, on the average, around 0.030. 

A coefficient of desensitization (A) is described and the method for 
its calculation set up. The negative regression of ‘Rows’ on “Bias” 
(i.e. the number of previous additions of the same dose of anaphylatoxin) 
indicating a linear downward trend of the responses to the same repeated 
dose of anaphylatoxin was explained as a linear regression of response 
on dose of “intrinsic histamine”. From these considerations, a function 
$D = Ce^{-x}$ for doses of “intrinsic histamine” has been derived and the
responses

$$Y = k \log_e D = k \log_e C - kx, \quad \text{where } -\log_e C = A,$$

should be identified with the linear equation of the regression line
“Rows” on “Bias”, indicative of the responses to the released intrinsic
histamine at the different levels of bias (0, 1, 2, 3, ...). The coincidence
between the theoretical equations and the experimental ones calculated 
by the least squares method is remarkable. 
Paper presented at the Joint Meetings of the American Institute of Mathematical
Statistics, American Statistical Association, and the Biometric Society 
(ENAR), New York, N.Y., U.S.A., December 27-29. 


387 GEORGE FERRIS. Three Useful Designs in Taste-Testing. 


There has been a tendency to use statistical designs intended for 
other fields of application in taste-testing, whether the model is appli- 
cable or not. Alternatively, non-parametric methods have been used. 
It is the intention of the paper to show how existing models can be
suitably modified for organoleptic purposes. 

The first model is intended for use by analytic taste-panels for 
quality control purposes when a number of samples are judged serially 
for flavor. It takes existing Latin Squares, Youden Designs, and In- 
complete Blocks, and modifies them in order to control “residual” 
or “carry-over” effects. 

The second model is intended for use by analytic panels judging 
color or another physical property in samples simultaneously set out 
alongside one another. It modifies existing designs and analyses to 
allow for the phenomenon of ‘‘adaptation’’ known to psychologists. 

The third model illustrates how a balanced design for incomplete blocks 
of two may by a slight variation be adapted for organoleptic purposes.

Each model is illustrated with a numerical example. 

Consumer testing is briefly referred to. 


Paper presented at the meeting of the French Region of the Biometric Society, February 8. 


388 D. SCHWARTZ, J. ULMO AND A. VESSEREAU. Problems
Relating to Stratified Sampling.


This communication deals with problems of stratified sampling in which
the number of units drawn is the same in the various units at a given
level, for example:
in the output of a centre manufacturing B.C.G. ampoules, l production
batches are chosen at random, the same number a of ampoules is drawn
from each, and each ampoule gives rise to t culture tubes on which the
number of living particles is measured;
or again:
in a delivery of paper, r reams are chosen at random, f sheets are drawn
at random from each, and each sheet yields the same number of
measurements of bursting strength.

Particular attention is given to the class of two-level problems, in which
a population is represented by e samples, each giving rise to m
measurements.

The optimum sampling plans are set out, according to whether the chief
interest lies in the mean value of the characteristic studied or in its
variability at the various levels.


THE BIOMETRIC SOCIETY 
General 


The paid up membership of the Society at 31st December, 1955 
stood at 1265, divided between regions as follows:— 


E. N. American    483            German         61
W. N. American     95            Indian         12
Australasian       50            Italian        (illegible)
Belgian            62            Japanese       43
Brazilian          (illegible)   Netherlands    29
British           155            Swedish        16
Danish             (illegible)   Swiss          22
French             65            At large       51


The office of the Secretary has been transferred from New Haven 
to Rothamsted during the summer. Members are asked to excuse any 
delays that have arisen during the transfer. 


Brazilian Region 


At a meeting held on January 3rd at the Instituto Biologico, Sao 
Paulo, the following were elected for 1956: 


President, Dr. C. G. Fraga, Jr; 
Secretary, Dr. P. M. Freire; 
Treasurer, Prof. A. Groszmann; 
Committee, Dr. Breiger, Dr. Bueno, Prof. Bitancourt, Prof. Memoria,
Dr. Penha, Dr. Conagin.


The following papers were presented at the same meeting—A. M. 
Penha and collaborators, on a problem of sex differentiation in snakes; 
M. Rocha e Silva, on a factorial assay for anaphylatoxin; R. Silva Leme,
on a new procedure for multiple regression; A. Groszmann, on bacterial 
growth curves; A. Conagin, on the comparison of several means; and 
F. Pimentel Gomes and collaborators, on the evaluation of milk
production.


Belgian Region 

In November 1955 Prof. Lousse, Rector of the Ecole de Médecine
Vétérinaire de Cureghiem, gave a lecture entitled “Pure and applied
research in animal physiology.”

The Société Adolphe Quetelet held its general meeting in Brussels
on March 22. The Board for 1956 is composed as follows:
President, Prof. D. de Meulemeester;
Vice-Presidents, MM. de Naeyer, Laurent, Lebrun, Welsch,
Lecrenier;
Secretary, M. L. Martin;
Assistant Secretary, Mlle. A. Lenger;
Treasurer, M. A. Rotti;
Members, MM. Roussel, Reuse, Bontemps.
Mlle. Lenger then gave a talk on the subject “Some comments
on a study tour in the United States.”


German Region 


A colloquium held at Bad Nauheim during January 27-29 was 
attended by 150 biometricians, including 35 from East Germany and 
guests from Austria and Chile. Topics included:— 


I Distribution-free methods. L. Schmetterer, A general survey;
H. Munzner, Permutation tests. 


II Biometry and Medicine. W. Oemisch, Graduation of growth 
data by the Normal curve; H. J. Heite, Statistical treatment of experi- 
mental mortality curves; J. Hartung, Sickness Insurance; G. Bertram 
and J. Hartung, Remarks on sequential designs; H. Hosemann, Para- 
bolic curves of neo-natal mortality; G. A. Lienert, Quantitative analysis 
in clinical research; K. V. Solth, Effects on gestation of various gyne- 
cological diseases. 


III Statistical Methods in Agriculture. H. Geidel, A report on
symbols and notation for agricultural biometry; W. Behrens, Correlation 
problems; H. Rundfeldt, A Critique of current techniques in agricultural 
research; W. Schnell, The accuracy and validity of field experiments; 
W. Behrens, The suitability of various experiment designs in the study 
of soil variation. 


French Region


The Region held its general meeting in Paris on February 8. The
following were elected: President, M. E. Morice; Member of Council,
M. D. Bargeton. M. D. Schwartz, Mlle. J. Ulmo and M. A. Vessereau
presented a communication entitled “Problems relating to stratified sampling.”


Australasian Region 


Visit of Professor M. G. Kendall: Professor Kendall gave a series of
lectures and seminars in Sydney, Canberra, Adelaide and Melbourne to
statisticians, economists, public servants and the general public on a
wide variety of topics. The Region owes a great deal to C.S.I.R.O. for
allowing non-C.S.I.R.O. members to see so much of Professor Kendall
during his tour. A conference in Melbourne, 9th—13th April, was 
attended by 50 delegates from the Division of Mathematical Statistics 
C.S.I.R.O., the Universities, other Divisions of C.S.I.R.O. and Govern- 
ment departments, and from Commercial Organizations. It was easily 
the largest gathering ever held of Australian mathematical statisticians. 
On the two free afternoons in the week, visits were arranged to the 
Olympic Site, the Electronic Computer at the University of Melbourne 
and the Division of Meteorological Physics, C.S.I.R.O. 

The first three days of this Conference, with Dr. G. S. Watson in 
the Chair, were devoted to Time Series. Professor Kendall gave two 
lectures in which he sketched an outline of the subject as it stands 
today. In the first lecture the distribution theory of serial correlation 
coefficients was discussed. The second lecture dealt with questions 
of inference. In general the discussion was limited to the case of auto- 
regressive processes in discrete time. Contributed papers were read 
by: Dr. D. G. Lampard, “A method of estimation of correlation
functions”; Mr. E. K. Webb, “Spectrum Analysis of continuous time
series”; Mr. R. T. Leslie and Dr. F. E. Binet, “Relaxed runs”; Dr. G. S.
Watson, “The joint distribution of circular serial correlation coefficients”
and Dr. E. J. Hannan, “Testing for serial correlation in least squares
regression”.

The subject of the last two days of the Conference, with Professor 
E. J. G. Pitman in the Chair, was Distribution-Free Inference. In the 
first of his two lectures on this topic, Professor Kendall gave a general 
discussion and attempted to define and classify some of the concepts 
involved. In the second, he classified many distribution-free tests 
and compared the relative asymptotic efficiencies of alternative tests 
of the same hypothesis. Contributed papers were read by: Dr. H. S.
Konijn, “Distribution-free procedures for testing treatment differences”;
Dr. H. O. Lancaster, “Some reconciliations of χ²”; Mr. G. A. MacIntyre,
“Distribution-free comparison of two sets of data” and Dr. G. S. Watson,
“Distribution-free tests, similar regions and sufficient statistics”.




NEWS AND ANNOUNCEMENTS 


The degree of Doctor of Laws was conferred on Harold Hotelling 
by the University of Chicago on 11 November, 1955, at the convocation 
in celebration of the twenty-fifth anniversary of the University’s Social 
Science Research Building. Dr. Hotelling was cited as the “foremost 
contemporary contributor of quantitative methods to the social sciences, 
who by mathematical analysis has notably advanced our understanding 
of fundamental problems in economics and statistics’. 

Dr. J. H. Bennett, who has been with Sir Ronald Fisher at the 
University of Cambridge, has gone to South Australia, to become 
head of the Department of Genetics, Adelaide University. 


Summer Session at the Massachusetts Institute of Technology 


The Statistical Summer Session formerly held at the University of 
Connecticut will this summer be held at Endicott House of the Massa- 
chusetts Institute of Technology during the weeks August 13 to 25 
inclusive. Two one-week programs will be offered. The first week 
will be under the chairmanship of Professor Leo Tick of New York
University on the general topic "Time Series"; the second week under
the chairmanship of Professor Max Woodbury of George Washington
University with the topic "The Impact of Computers on Statistics".
Requests for information and reservations should be sent to Dr. M. E. 
Terry, Bell Telephone Laboratories, Murray Hill, N. J., U.S.A. 


Annual Meeting of the Biometric Society (WNAR)


The annual meeting of the Biometric Society (WNAR) will be held
at the University of Washington in Seattle during the period of August
22-24, in conjunction with national meetings of the Institute of
Mathematical Statistics, the American Mathematical Society, the
Mathematical Association and the Econometric Society. The Special Invited
Address will be given by Professor N. Rashevsky, Chairman of the
Committee of Mathematical Biology at the University of Chicago.
Another address will be given jointly to the Institute and the Society
by Professor Harold Hotelling, University of North Carolina, on "Some
new light on the multiple correlation coefficient". There will be sessions
on statistical problems in forestry, in medicine, and on prediction
problems and possibly others. There will also be a session for
contributed papers. Abstracts of contributed papers and requests for
accommodations should be sent to the Program Chairman, Douglas G.
Chapman, Department of Mathematics, University of Washington, Seattle.
A feature of the entertainment will be a salmon barbecue on Puget Sound.


Meeting of the Association for Computing Machinery 


The annual meeting of the Association for Computing Machinery 
will be held on the University of California Westwood Campus, Los 
Angeles, August 27-29, 1956. (See January issue of the Journal of the 
Association for Computing Machinery). For information write G. W. 
King, Box 3251, Olympic Station, Beverly Hills, California, U.S.A.


Australian National University Research Scholarships 


Applications are invited from graduates for enrolment as research 
students in various Schools, including the Research Schools of Social 
Sciences and Pacific Studies, in which research may be done in Statistics, 
including Mathematical Statistics. Particulars and application forms 
may be obtained from: (1) The Australian Embassy, 2941 Massachusetts 
Avenue, Washington, D. C., U.S.A.; (2) The Australian Consulate- 
General, 206 Sansome Street, San Francisco 4, California, U.S.A.;
(3) The Australian Consulate-General, Room 426, International Build- 
ing, 636 Fifth Avenue, New York 20, N. Y., U.S.A.; (4) The Office of the 
Australian High Commissioner, 5th Floor, Royal Bank Chambers, 
100 Sparks Street, Ottawa, Canada. 


Australasian Region of the Biometric Society 


The Society will hold a general meeting at 8 p.m. 16th August, in 
the Arts Building, University of Melbourne, at which Dr. E. A. Cornish 
will give his Presidential Address. It will form a part of the Australian 
Mathematical Society Meeting. 

International Biometric Seminar, 24 September to 3 October 1956,
and International Biometric Symposium, 1 October to 3 October 1956,
for the German-speaking regions.

Venue: Linz a. d. Donau (Austria). Organisation:
Prof. Dr. Arthur Linder, Geneva (Switzerland), Avenue de Champel 24.
Secretariat: Dr. Adolf Adam, Österreichische Stickstoffwerke
Aktiengesellschaft, Abt. UAW, Linz a. d. Donau, St. Peter 224
(Austria); telephone 29181, extension 1508.

The International Biometric Society, whose purpose is the
development, application and dissemination of quantitative methods
in biology, is organising in Linz a. d. Donau (Austria) a
seminar on research planning for biologists, biochemists, agricultural
chemists, physicians, pharmacologists, plant protection specialists,
agriculturists and foresters. The seminar is divided into basic training
in the planning and evaluation of experiments and observations, through
lectures, exercises, discussions and visits, and a detailed treatment
of current practical problems. The instruction will be given in German
by international specialists.

Seminar participants will have the opportunity to attend, as listeners,
the discussions of the International Biometric Symposium
on laws of growth and yield (1 October 1956), transformations
in the evaluation of frequencies and response curves
(2 October 1956), and assessment of the effect of remedies in chronic
diseases (3 October 1956).
The number of participants in the International Biometric Seminar is
limited to 100.

The fee for the seminar participant's card will be S 150.-,
sfr. 25.-, DM 25.-. Those interested are requested to send their
address, for the dispatch of the seminar programme, registration
forms, etc., to the secretariat: Dr. Adolf Adam, Österreichische
Stickstoffwerke Aktiengesellschaft, Abt. UAW, Linz a. d. Donau,
St. Peter 224 (Austria).


EXTENSIONS TO MISSING PLOT TECHNIQUES


H. R. Thompson


Applied Mathematics Laboratory, Department of Scientific and Industrial Research 
Wellington, New Zealand 


When several plots in an experiment have missing observations it
often becomes very laborious to apply the simple method of giving
preliminary estimates to all except one, calculating it by the formula
for a single missing plot and proceeding iteratively through two or more
cycles. The technique set out in this note permits the simultaneous
estimation of missing observations in a randomised blocks experiment
with $r$ blocks and $t$ treatments, for the special case when there is one
missing observation in each of $n$ treatments ($n < t$), distributed over
$p$ blocks such that $n_i$ are in the $i$th block. (The entirely analogous case
of one missing observation in each of $n$ blocks ($n < r$) is obtained by
interchanging $r$ and $t$ in the succeeding formulae.) More complicated
cases, e.g., some treatments having more than one missing observation,
can be solved iteratively, either by a combination of the single plot
technique and the present technique or by applying the present
technique twice, depending on the distribution of the missing observations.
The method is likely to prove useful when the data from part of one or
more blocks are missing for reasons beyond the experimenter's
control.

Let the missing observations in the $i$th block ($i = 1, \dots, p$) be
denoted by $x_{ij}$ ($j = 1, \dots, n_i$), the totals of existing plots in the
treatments corresponding to these observations by $T'_{ij}$, the total of existing
plots in the $i$th block by $B'_i$ and the grand total of existing plots by $G'$.
The simultaneous estimation by least squares of the $x_{ij}$ leads to the
equations

$$(r-1)(t-1)\,x_{ij} \;-\; (r-1)\sum_{j' \neq j} x_{ij'} \;+\; \sum_{i'=1}^{i-1}\sum_{j'=1}^{n_{i'}} x_{i'j'} \;+\; \sum_{i'=i+1}^{p}\sum_{j'=1}^{n_{i'}} x_{i'j'} \;=\; rB'_i + tT'_{ij} - G'$$

$$(i = 1, \dots, p;\; j = 1, \dots, n_i),$$

which are, in matrix notation,

$$Ax = C,$$


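where $x$ is the column matrix $(x_{ij})$, $C$ is the column matrix
$(rB'_i + tT'_{ij} - G')$, and

$$A = \begin{pmatrix} A_1 & K_{12} & \cdots & K_{1p} \\ K_{21} & A_2 & \cdots & K_{2p} \\ \vdots & \vdots & & \vdots \\ K_{p1} & K_{p2} & \cdots & A_p \end{pmatrix},$$

in which $A_i$ is an $n_i \times n_i$ matrix with elements $a_i = (r-1)(t-1)$ in
the main diagonal and $b_i = -(r-1)$ elsewhere, and
$K_{ij}$ is an $n_i \times n_j$ matrix with all elements equal to $k_{ij}$ ($k_{ij} = 1$).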
The solution of the normal equations may be shown to be 


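$$x = A^{-1}C,$$

where

$$A^{-1} = \begin{pmatrix} A^{1} & K^{12} & \cdots & K^{1p} \\ K^{21} & A^{2} & \cdots & K^{2p} \\ \vdots & \vdots & & \vdots \\ K^{p1} & K^{p2} & \cdots & A^{p} \end{pmatrix},$$

in which the $A^{i}$ and $K^{ij}$ are matrices of the same form as $A_i$ and $K_{ij}$,
$A^{i}$ having elements $a^{i}$ in the main diagonal and $b^{i}$ elsewhere, and
$K^{ij}$ having all elements equal to $k^{ij}$.
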
$p = 3$. Writing $C_i = (r-1)(t-n_i)$,

$$a^{1} = \left[(C_1 + r - 1)(C_2C_3 - n_2n_3) - (n_1 - 1)(n_2C_3 + n_3C_2 - 2n_2n_3)\right]/D_1,$$

$$b^{1} = \left[(r-1)(C_2C_3 - n_2n_3) + (n_2C_3 + n_3C_2 - 2n_2n_3)\right]/D_1,$$

$$k^{12} = -(r-1)\,t\,(C_3 - n_3)/D_1,$$

$$k^{13} = -(r-1)\,t\,(C_2 - n_2)/D_1,$$

where

$$D_1 = (r-1)\,t\left[C_1(C_2C_3 - n_2n_3) - n_1(n_2C_3 + n_3C_2 - 2n_2n_3)\right],$$

and similarly for the other two sets by interchanging the suffices.


EXAMPLE 


In an experiment on virus prevention in swedes (for the data of
which the author is indebted to Mr. P. R. Fry, Plant Diseases Division,
D.S.I.R.) with 14 treatments and 6 blocks, there were 9 missing
observations, one in each of 9 treatments, distributed over 3 blocks as indicated
in Table 1, in which the various treatment and block totals are also
given. The original data gave the percentage infection in each plot,
and these percentages have been transformed to equivalent angles.


TABLE 1
Missing Plot Calculations

Treatment   T'_ij    B'_i     G'        6B'_i + 14T'_ij - G'    x_ij
   10       298.9    666.8    3697.5         4487.9             62.1
    2       169.7    566.8      "            2079.1             39.2
   11       215.1      "        "            2714.7             48.3
   13       211.9      "        "            2669.9             47.6
   14       280.4      "        "            3628.9             61.3
    1       392.1    549.8      "            5090.7             87.4
    3       221.4      "        "            2700.9             53.2
    4       239.7      "        "            2957.1             56.9
   12       250.7      "        "            3111.1             59.1


Substituting $r = 6$, $t = 14$, $n_1 = 1$, $n_2 = n_3 = 4$ in the equations for
$p = 3$, we find
$$A^{-1} = \frac{1}{2819110}\begin{pmatrix}
43470 & -805 & -805 & -805 & -805 & -805 & -805 & -805 & -805 \\
-805 & 44406 & 4133 & 4133 & 4133 & -1120 & -1120 & -1120 & -1120 \\
-805 & 4133 & 44406 & 4133 & 4133 & -1120 & -1120 & -1120 & -1120 \\
-805 & 4133 & 4133 & 44406 & 4133 & -1120 & -1120 & -1120 & -1120 \\
-805 & 4133 & 4133 & 4133 & 44406 & -1120 & -1120 & -1120 & -1120 \\
-805 & -1120 & -1120 & -1120 & -1120 & 44406 & 4133 & 4133 & 4133 \\
-805 & -1120 & -1120 & -1120 & -1120 & 4133 & 44406 & 4133 & 4133 \\
-805 & -1120 & -1120 & -1120 & -1120 & 4133 & 4133 & 44406 & 4133 \\
-805 & -1120 & -1120 & -1120 & -1120 & 4133 & 4133 & 4133 & 44406
\end{pmatrix}$$


The calculations can be quite easily systematized to obtain the $x_{ij}$,
and these are given in the last column of Table 1. It is interesting to
note that if a preliminary estimate of the first missing value is made
and the other 8 calculated using $p = 2$, the values obtained after one
cycle are all within 0.1 of their correct estimated values, whereas if
the solution for $p = 1$ is used and preliminary estimates made of the
first five, two cycles are required to obtain the correct values and the
calculations are more laborious.
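
The example can be checked numerically. The following is a minimal sketch, not the author's computing scheme: it assembles the normal equations $Ax = C$ directly from the totals in Table 1 and solves them by ordinary Gaussian elimination instead of using the closed-form inverse. The helper `solve` and the list `missing` are names introduced here purely for illustration.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


r, t = 6, 14          # blocks, treatments
G = 3697.5            # grand total of existing plots, G'

# One entry per missing plot: (block index, B' of that block, T' of its treatment),
# in the order of Table 1 (treatments 10; 2, 11, 13, 14; 1, 3, 4, 12).
missing = [
    (0, 666.8, 298.9),
    (1, 566.8, 169.7), (1, 566.8, 215.1), (1, 566.8, 211.9), (1, 566.8, 280.4),
    (2, 549.8, 392.1), (2, 549.8, 221.4), (2, 549.8, 239.7), (2, 549.8, 250.7),
]

n = len(missing)
A = [[0.0] * n for _ in range(n)]
for u, (block_u, _, _) in enumerate(missing):
    for v, (block_v, _, _) in enumerate(missing):
        if u == v:
            A[u][v] = (r - 1) * (t - 1)   # diagonal element a_i
        elif block_u == block_v:
            A[u][v] = -(r - 1)            # same block, element b_i
        else:
            A[u][v] = 1.0                 # different blocks, element k_ij

C = [r * B + t * T - G for _, B, T in missing]
estimates = solve(A, C)
print([round(value, 1) for value in estimates])
```

The printed values can be compared with the last column of Table 1. Solving the full system is practical here only because a machine does the arithmetic; the closed-form inverse in the text is what makes the method convenient for hand calculation.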


THE ANALYSIS OF A 3 × 6 EXPERIMENT ARRANGED IN
A QUASI-LATIN SQUARE


G. E. Hodnett
Rothamsted Experimental Station, Harpenden, Herts., England 


Introduction 


A factorial experiment on Star Grass (Cynodon sp.) to investigate the
effect of three frequencies of cutting and six levels of nitrogen was
commenced in 1953, at the Grasslands Agricultural Research Station,
Marandellas, S. Rhodesia. The design was derived from that given by


TABLE 1*
Design, Field Plan and Yields 1953/54
[Dry matter: lb. per plot (1/200 acre)]

             c0:   J3      J1      J1      J2      J2      J3
 c0    c1    c1:   J2      J2      J3      J3      J1      J1     Total

 I3    I1          200     221     120     111     010     001
                   3.5    28.6    19.3    29.9    17.8    48.6    147.7
 I2    I1          221     210     111     100     001     020
                  31.9     9.7    30.5     3.5    38.4    34.0    148.0
 I2    I3          020     011     210     201     100     121
                  31.0    51.4     9.1    22.9     2.2    30.3    146.9
 I1    I2          110     101     000     021     220     211
                  10.5    23.2    13.0    55.4    16.2    29.7    148.0
 I3    I2          101     120     021     010     211     200
                  24.3    21.8    57.1    17.4    24.5     4.3    149.4
 I1    I3          011     000     201     220     121     110
                  42.3     4.4    23.5    16.0    41.6    16.7    144.5

 Total           143.5   139.1   152.5   145.1   140.7   163.6    884.5

200 = a2 b0 c0, 221 = a2 b2 c1, etc.

*The I and J components of the interactions AB and ABC are partially confounded respectively
with the rows and columns of the square, the I's and J's above being the diagonal sets of combinations
of a and b as defined in Table 2 (iii).
Yates [1] for a 3 × 3 × 2 experiment (factors A, B, C) arranged in a
6 × 6 quasi-Latin square, by formally identifying the six levels of
nitrogen with the six combinations of the pseudo-factors B and C as in
Table 2. The design, layout and yields in the first season are shown in
Table 1.

(Treatments: Frequency of cutting (a): 2, 3, 6 cuts at regular intervals
over a period of 18 weeks during the rainy season. Nitrogen (b and c):
0, 500, 1000, 1500, 2000, 2500 lb per acre sulphate of ammonia (21% N):
each level being applied in sixths at intervals of 3 weeks. Basal dressing:
400 lb per acre superphosphate with the first dressing of nitrogen.)

As there was considerable interest in the interaction between the two 
factors, the two-way table of results was required free from row and 
column effects, together with the appropriate standard errors. In order 
to make the adjustments, it was necessary to evaluate the formal 
three-factor interaction, since it was a component of the interaction 
between nitrogen and the frequency of cutting. In the course of this 
analysis, a general method was developed for determining the standard 
errors of comparisons between adjusted yields. This note puts on record 
the methods used in the analysis of this unusual design. The experiment 
is being continued and the agricultural implications of the results will 
be discussed elsewhere [2]. 


Analysis 


The computations were set out as in Table 2. The sums of squares for
rows and columns (ignoring treatments) and for the main effects of the
two factors were computed in the usual way from the marginal totals
of Tables 1 and 2 (i) respectively. The table of sums $(c_1 + c_0)$ was the
starting point for computing the effects of A, B and AB while the
interactions of these effects with C were computed from the table of
differences $(c_1 - c_0)$. For the main effect of C, the sum of squares was
obtained from the total of the latter table as $(383.7)^2/36$. As a check,
the total of the sums of squares for the components B, C and BC was
compared with the sum of squares for nitrogen already obtained. The
linear effect of nitrogen N' was also evaluated, as the nitrogen totals
indicate a very marked linear response.
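
As a small arithmetic illustration of the single-degree-of-freedom sums of squares just described, the sketch below recomputes the linear nitrogen component from the nitrogen totals of Table 2 (i). It assumes the usual divisor for a contrast of totals, namely the number of plots in each total (here 6) times the sum of squared coefficients (here 70); the function name `contrast_ss` is introduced only for this example.

```python
nitrogen_totals = [30.9, 81.2, 138.3, 180.9, 208.3, 244.9]  # Table 2 (i), n0..n5
linear_coeffs = [-5, -3, -1, 1, 3, 5]                        # linear component N'
plots_per_total = 6                                          # 3 frequencies x 2 replicates


def contrast_ss(totals, coeffs, plots):
    """Sum of squares for one contrast among totals of `plots` plots each."""
    value = sum(c * total for c, total in zip(coeffs, totals))
    divisor = plots * sum(c * c for c in coeffs)
    return value ** 2 / divisor


ss_linear = contrast_ss(nitrogen_totals, linear_coeffs, plots_per_total)
ss_c = 383.7 ** 2 / 36               # main effect of the pseudo-factor C
remainder = 5395.80 - ss_linear      # nitrogen S.S. not explained by N'

print(round(ss_linear, 2), round(ss_c, 2), round(remainder, 2))
```

The first and third figures correspond to the entries for the linear component and the remainder in Table 3.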

For the interaction AB, the estimate of the components and the
corresponding sums of squares were computed as in Table 2 (iii). The
ordinary I and J totals of the $(c_1 + c_0)$ table were adjusted by means of
the row and column totals ($R_1 \cdots R_6$, $C_1 \cdots C_6$) in order to eliminate
the effects of rows and columns respectively. These expressions for
the adjusted totals (I', J') may be derived by the standard method of
fitting constants (Yates [3], Nair [4], Kempthorne [5]). Substitution
of the constants gives

$$2I'_1 = 24i_1 + 6i_2 + 6i_3$$
$$2I'_2 = 6i_1 + 24i_2 + 6i_3$$
$$2I'_3 = 6i_1 + 6i_2 + 24i_3$$

which also verifies that the I' are free from row constants. Although
the I' are correlated, differences between them are proportional to the
differences between the constants. For example:

$$2(I'_1 - I'_2) = 18(i_1 - i_2)$$

so that

$$i_1 - i_2 = \tfrac{1}{9}(I'_1 - I'_2).$$


Thus the estimates of the I component of AB (2 d.f.) are given by the
deviations of the I' from their mean divided by 9 (computed as Δ2I'/18).
In the absence of confounding the corresponding estimates would be
ΔI/12, so that the partial confounding has reduced the 'effective'
replication from 12 to 9, i.e. by a factor of 3/4 (the relative information).
The normal divisor for the sum of squares (here 12) is also modified


TABLE 2
Computations

(i) Treatment totals

                                   Nitrogen
Frequency       n0      n1      n2      n3      n4      n5
of              c0      c0      c0      c1      c1      c1     Total
cutting         b0      b1      b2      b0      b1      b2
a0             17.4    35.2    65.0    87.0    93.7   112.5    410.8
a1              5.7    27.2    41.1    47.5    60.4    71.9    253.8
a2              7.8    18.8    32.2    46.4    54.2    60.5    219.9
Total          30.9    81.2   138.3   180.9   208.3   244.9    884.5
Linear com-
ponent N'       -5      -3      -1      +1      +3      +5

S.S. for N' = (1/420) (-5 × 30.9 - 3 × 81.2 ... + 5 × 244.9)² = 5313.66

(ii) Two-way tables

Frequency                c1 + c0                            c1 - c0
of cutting      b0      b1      b2     Total       b0      b1      b2     Total
a0            104.4   128.9   177.5    410.8      69.6    58.5    47.5    175.6
a1             53.2    87.6   113.0    253.8      41.8    33.2    30.8    105.8
a2             54.2    73.0    92.7    219.9      38.6    35.4    28.3    102.3
Total         211.8   289.5   383.2    884.5     150.0   127.1   106.6    383.7

(iii) From c1 + c0

Component                                         ΔI       i = ΔI/12
I1 = 104.4 +  87.6 +  92.7 = 284.7              -10.1       -0.84
I2 =  53.2 +  73.0 + 177.5 = 303.7              + 8.9       +0.74
I3 =  54.2 + 128.9 + 113.0 = 296.1              + 1.3       +0.11
Total                        884.5
Mean                         294.83

Component                                         ΔJ       j = ΔJ/12
J1 = 104.4 +  73.0 + 113.0 = 290.4              - 4.4       -0.37
J2 =  53.2 + 128.9 +  92.7 = 274.8              -20.0       -1.67
J3 =  54.2 +  87.6 + 177.5 = 319.3              +24.5       +2.04
Total                        884.5
Mean                         294.83

Adjusted component                               Δ2I'      i' = Δ2I'/18
2I1' = 2I1 + R3 + R5 = 865.7                    -18.8       -1.05
2I2' = 2I2 + R1 + R6 = 899.6                    +15.1       +0.84
2I3' = 2I3 + R2 + R4 = 888.2                    + 3.7       +0.21
Total                 2653.5
Mean                   884.5

Adjusted component                               Δ2J'      j' = Δ2J'/18
2J1' = 2J1 + C1 + C4 = 869.4                    -15.1       -0.84
2J2' = 2J2 + C3 + C6 = 865.7                    -18.8       -1.04
2J3' = 2J3 + C2 + C5 = 918.4                    +33.9       +1.88
Total                 2653.5
Mean                   884.5

S.S. for AB = (1/36) (18.8² + 15.1² + 3.7² + 15.1² + 18.8² + 33.9²) = 64.61

(iv) From c1 - c0

Component                                         ΔH       h = ΔH/12
H1 = 131.1                                       +3.2       +0.27
H2 = 124.7                                       -3.2       -0.27
H3 = 127.9                                        0.0        0.00
Total 383.7
Mean  127.9

Component                                         ΔK       k = ΔK/12
K1 = 135.8                                       +7.9       +0.66
K2 = 128.6                                       +0.7       +0.06
K3 = 119.3                                       -8.6       -0.72
Total 383.7
Mean  127.9

Component                                        Δ2H'      h' = Δ2H'/6
2H1' = 2H1 - R1 - R2 + R4 + R6 = 259.0           +3.2       +0.53
2H2' = 2H2 + R2 + R3 - R4 - R5 = 246.9           -8.9       -1.48
2H3' = 2H3 + R1 - R3 + R5 - R6 = 261.5           +5.7       +0.95
Total                            767.4
Mean                             255.8

Component                                        Δ2K'      k' = Δ2K'/6
2K1' = 2K1 + C2 + C3 - C5 - C6 = 258.9           +3.1       +0.52
2K2' = 2K2 - C1 - C2 + C4 + C5 = 260.4           +4.6       +0.77
2K3' = 2K3 + C1 - C3 - C4 + C6 = 248.1           -7.7       -1.28
Total                            767.4
Mean                             255.8

S.S. for ABC = (1/12) (3.2² + 8.9² + 5.7² + 3.1² + 4.6² + 7.7²) = 17.67

by this factor. Thus the sum of squares for AB (I) is

$$\tfrac{1}{9}\,S(\Delta I')^2 = \tfrac{1}{36}\,S(\Delta 2I')^2$$

which is the form most suitable for computations. The J component is
treated similarly.
Section (iv) of Table 2 shows the corresponding computations for the
formal three factor interaction ABC, where H and K are the I and J
totals of the $(c_1 - c_0)$ table. In this case, for example

$$h'_1 - h'_2 = \tfrac{1}{3}(H'_1 - H'_2)$$

whereas in the absence of confounding

$$h_1 - h_2 = \tfrac{1}{12}(H_1 - H_2)$$

so that the relative information on ABC is $3/12 = \tfrac{1}{4}$. By the rule
used above, the sum of squares for ABC (I) (2 d.f.) is given by

$$\tfrac{1}{3}\,S(\Delta H')^2 = \tfrac{1}{12}\,S(\Delta 2H')^2$$


TABLE 3
Analysis of Variance

Source of variation        D.F.    Sum of squares    Mean square
Rows                         5          2.24
Columns                      5         70.45
Frequency of cutting         2       1728.92           864.46
Nitrogen                     5       5395.80          1079.16
   Linear (N')               1       5313.66          5313.66
   remainder                 4         82.14
Interaction                 10        367.20
   AB                        4         64.61
   AC                        2        284.92
   ABC                       4         17.67
Error                        8        202.27            25.28
Total                       35       7766.88

S.E. per plot, s = ±5.03 or 20.5%


The complete analysis of variance is shown in Table 3. Both the 
main effects were clearly significant, the sum of squares for nitrogen 
being almost entirely due to the linear component. Although the inter- 
action mean square (10 d.f.) did not reach significance, the magnitude 
of the AC component indicates that there was some effect. From the 
table of adjusted mean yields (Table 4), the response to nitrogen is 
readily seen to have been reduced as the frequency of cutting was 


increased. The nature of this interaction is examined in more detail 
later. 
TABLE 4
Treatment Means Adjusted for Confounding

                                      Nitrogen
Frequency        n0        n1        n2        n3        n4        n5
of               c0        c0        c0        c1        c1        c1       Mean
cutting          b0        b1        b2        b0        b1        b2      (±1.45)
a0              7.9(a)   16.7(b)   34.2(c)   43.0(j)   49.2(k)   54.4(l)    34.2
a1              4.1(d)   13.5(e)   19.4(f)   24.0(m)   29.5(n)   36.4(o)    21.2
a2              3.5(g)   10.4(h)   15.6(i)   23.5(p)   25.4(q)   31.7(r)    18.3
Mean (±2.05)    5.2      13.6      23.1      30.1      34.7      40.8       24.6


The adjusted treatment means (Table 4) were obtained using the
following formula:

$$\bar{y}'_{abc} = \bar{y}_{abc} + i' + j' - i - j \pm (h' + k' - h - k)$$

where the required adjustments are determined by the I and J sets to
which the treatment combination belongs; the plus sign is used when the
treatment is at the $c_1$ level and the minus sign when it is at the $c_0$ level.
Consider, for example, the adjustment of the mean for $a_0b_0c_0$. This
treatment belongs to the sets $I_1$ and $J_1$ and is at level $c_0$. Thus from
Table 2

$$\bar{y}'_{a_0b_0c_0} = \frac{17.4}{2} - 1.05 - 0.84 + 0.84 + 0.37 - (0.53 + 0.52 - 0.27 - 0.66) = 7.9$$
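
The adjustment rule can be applied mechanically to any cell. The sketch below is an illustration only: it takes the rounded estimates printed in Table 2 as given (with $j'_3 = +1.88$ obtained as 33.9/18), and it assumes that the diagonal sets can be indexed as $I_{1+((a-b) \bmod 3)}$ and $J_{1+((a+b) \bmod 3)}$, which reproduces the sets listed in Table 2 (iii). The function name `adjusted_mean` is hypothetical.

```python
# Treatment totals from Table 2 (i): rows a0..a2, columns n0..n5 (n = b + 3c).
totals = [
    [17.4, 35.2, 65.0, 87.0, 93.7, 112.5],
    [5.7, 27.2, 41.1, 47.5, 60.4, 71.9],
    [7.8, 18.8, 32.2, 46.4, 54.2, 60.5],
]

# Estimates from Table 2 (iii) and (iv); list index 0, 1, 2 stands for set 1, 2, 3.
i_plain, i_adj = [-0.84, 0.74, 0.11], [-1.05, 0.84, 0.21]
j_plain, j_adj = [-0.37, -1.67, 2.04], [-0.84, -1.04, 1.88]
h_plain, h_adj = [0.27, -0.27, 0.00], [0.53, -1.48, 0.95]
k_plain, k_adj = [0.66, 0.06, -0.72], [0.52, 0.77, -1.28]


def adjusted_mean(a, b, c):
    """Mean of treatment a_a b_b c_c adjusted for confounding (two replicates)."""
    i_set = (a - b) % 3          # assumed indexing of the I diagonal sets
    j_set = (a + b) % 3          # assumed indexing of the J diagonal sets
    sign = 1 if c == 1 else -1   # plus at the c1 level, minus at the c0 level
    unadjusted = totals[a][b + 3 * c] / 2
    two_factor = i_adj[i_set] + j_adj[j_set] - i_plain[i_set] - j_plain[j_set]
    three_factor = h_adj[i_set] + k_adj[j_set] - h_plain[i_set] - k_plain[j_set]
    return unadjusted + two_factor + sign * three_factor


print(round(adjusted_mean(0, 0, 0), 1))   # the worked example a0 b0 c0
print(round(adjusted_mean(2, 1, 1), 1))   # cell q of Table 4
```

Because the inputs are rounded to two decimals, other cells may differ from Table 4 by a unit in the last place.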


Computation of standard errors of comparisons between adjusted treatment 
means 


Because of the loss of information on some of the components of the 
interaction due to the confounding, the standard errors of comparisons 
within the body of Table 4 will be increased by factors which depend 
on the particular components of interaction involved. The standard 
error of any desired comparison may be computed quite simply using 
the scheme shown in Table 5 which will be described step by step. 
The underlying theory is given in the appendix.
The 17 orthogonal degrees of freedom representing main effects and
interactions of the factors A, B, C are written down in terms of the
treatment combinations, which are labelled a, b, c ... q, r in Table 4.
For example, the linear effect of B, i.e. B', is

$$(l + o + r + c + f + i) - (j + m + p + a + d + g);$$

the effect of the other pseudo-factor C is

$$(j + k + \cdots + r) - (a + b + \cdots + i)$$

and so on for the quadratic components and the interactions. The
relative information (R.I.), given by Yates [1], on each of these contrasts
is also noted. The sums of squares (S.S.) of the numbers in each row
are then tabulated (Table 5).

If the variance of a simple difference between two mean yields, e.g.
(q − a), is required, then the numbers in the column headed 'a' are
subtracted line by line from those in the 'q' column. The results are
shown in the column headed (q − a). The factor by which the normal
variance of a treatment mean (here 25.28/2, since there are two
replicates) must be multiplied is then given by the sum of the quantities

$$\frac{(q - a)^2}{\text{S.S.} \times \text{R.I.}}$$

which are computed line by line. For the difference (q − a) this factor is

$$\left(\frac{2^2}{12} + \frac{1^2}{12} + \frac{3^2}{36} + \frac{2^2}{18} + \frac{2^2}{36} + \frac{1^2}{12} + \frac{1^2}{36}\right) + \left(\frac{1^2}{8} + \frac{1^2}{24} + \frac{1^2}{24} + \frac{3^2}{72}\right) \times \frac{4}{3} + \left(\frac{1^2}{8} + \frac{3^2}{24} + \frac{1^2}{24} + \frac{1^2}{72}\right) \times 4 = \frac{34}{9}$$

so that var (q − a) = (34/9)s²/2 = 47.75. As a check the factor becomes
2 for a comparison in which the relative information is unity for all the
orthogonal contrasts involved.
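
The line-by-line scheme lends itself to a short program. The following sketch is an assumption-laden illustration rather than a transcription of Table 5: it builds the 17 contrasts from the usual linear (−1, 0, 1) and quadratic (1, −2, 1) coefficients for A and B and (−1, 1) for C, and takes the relative information to be 3/4 for every AB contrast, 1/4 for every ABC contrast and 1 otherwise, as stated in the text. Function and variable names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

LINEAR = (-1, 0, 1)
QUADRATIC = (1, -2, 1)
A_PARTS = [("", (1, 1, 1)), ("A'", LINEAR), ("A''", QUADRATIC)]
B_PARTS = [("", (1, 1, 1)), ("B'", LINEAR), ("B''", QUADRATIC)]
C_PARTS = [("", (1, 1)), ("C", (-1, 1))]
CELLS = list(product(range(3), range(3), range(2)))   # (a, b, c) levels


def contrasts():
    """Return (name, relative information, coefficients) for the 17 contrasts."""
    out = []
    for (an, av), (bn, bv), (cn, cv) in product(A_PARTS, B_PARTS, C_PARTS):
        if not (an or bn or cn):
            continue                                   # skip the total
        if an and bn:
            ri = Fraction(1, 4) if cn else Fraction(3, 4)
        else:
            ri = Fraction(1)
        coef = {(a, b, c): av[a] * bv[b] * cv[c] for a, b, c in CELLS}
        out.append((an + bn + cn, ri, coef))
    return out


def variance_factor(comparison):
    """Sum of d^2 / (S.S. x R.I.) for a comparison given as {cell: coefficient}."""
    total = Fraction(0)
    for _name, ri, coef in contrasts():
        d = sum(weight * coef[cell] for cell, weight in comparison.items())
        ss = sum(value * value for value in coef.values())
        total += Fraction(d * d, ss) / ri
    return total


q_minus_a = {(2, 1, 1): 1, (0, 0, 0): -1}              # q = a2 b1 c1, a = a0 b0 c0
n_linear = (-5, -3, -1, 1, 3, 5)                       # N' over n = b + 3c
a_lin_n = {(a, b, c): LINEAR[a] * n_linear[b + 3 * c] for a, b, c in CELLS}
a_quad_n = {(a, b, c): QUADRATIC[a] * n_linear[b + 3 * c] for a, b, c in CELLS}

pair_factors = [variance_factor({x: 1, y: -1}) for x, y in combinations(CELLS, 2)]
mean_pair_factor = sum(pair_factors) / len(pair_factors)

print(variance_factor(q_minus_a))                      # 34/9
print(variance_factor(a_lin_n), variance_factor(a_quad_n))
print(mean_pair_factor / 2)                            # mean variance in units of s^2
```

Each factor is multiplied by s²/2 to give a variance; exact fractions are used so that the results can be compared directly with the fractions quoted in the text.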

The variances of the different types of comparison between pairs of 
treatment combinations are given in Table 6 together with examples and 
the numbers of each type. The average variance of the comparisons can
then be computed if required and hence the standard error which can be
assigned to the treatment means. In experiments where there is little
loss of information the average standard error will be satisfactory for
most purposes.
TABLE 6

Variances of different types of comparison between pairs of treatment combinations

Type of comparison
(factor levels differing)     Example     No.     Variance ÷ s²
A                              d - a       18        19/9
B                              b - a       18        19/9
A and B                        h - d       36        14/9
C                              n - e        9        21/9
C and A                        r - c       18        13/9
C and B                        k - a       18        13/9
C and A and B                  q - a       36        17/9
Total or mean                             153     (13 × 21)/(17 × 9)

As the response to nitrogen was substantially linear, it was decided
to examine the interactions of the other factor with this component,
i.e. A'N' and A''N'. The values of these components computed from
Table 4 were −2.07 ± 0.60 and +0.50 ± 0.35 respectively in units of
a single plot. The interaction A'N' is thus significant.

The computation of the standard errors of these components provides 
an example of how the above method can be readily extended to cover 
all types of linear comparisons between the adjusted treatment means. 

Writing A'N' in the form

$$5(a + r - g - l) + 3(b + q - h - k) + (c + p - i - j)$$

the value of each term may easily be computed line by line (as for
(q − a) above) and afterwards combined. The results are shown in
the last column of Table 5. Applying the above rule

$$\operatorname{var}(A'N') = \left(\frac{16^2}{8} \times \frac{4}{3} + \frac{36^2}{12} \times 1\right) \times \frac{s^2}{2} = \frac{226}{3} \times s^2$$

If the fraction 4/3 is replaced by 1, the variance becomes 70 × s² as it
should be if there was no confounding. Similarly var (A''N') is found
to be 452s²/2.

This is actually a rather simple case since only two orthogonal
contrasts are involved. It is easy to verify that

$$N' = 2B' + 3C$$

so that

$$A'N' = 2A'B' + 3A'C$$

and

$$A''N' = 2A''B' + 3A''C$$

Hence

$$\operatorname{var}(A'N') = 4 \operatorname{var}(A'B') + 9 \operatorname{var}(A'C) = \left(4 \times 8 \times \frac{4}{3} + 9 \times 12 \times 1\right) \times \frac{s^2}{2} = \frac{226}{3} \times s^2 \text{ as before.}$$


This example illustrates the principle underlying the above procedure: 
the required comparison is expressed linearly in terms of the 17 ortho- 
gonal contrasts, whose variances are readily available. Since the 
contrasts are orthogonal, the required variance is obtained as the sum 
of the variances of the separate terms. 

The method can also be readily applied in the computation of stand- 
ard errors of comparisons in all types of split-plot experiments. The 
error mean square divided by the relative information in the confounded 
experiment is merely replaced by the appropriate error mean square 
(on a sub-unit basis) from the split-plot analysis. 
Summary


The analysis of a 3 X 6 factorial experiment arranged in a 6 X 6 
quasi-Latin square is described and details are given of a general method 
for computing the standard errors of comparisons between treatment 
means adjusted for confounding. The application of this method to the 
computation of standard errors of treatment comparisons in split-plot 
designs is indicated. 
APPENDIX


(a) The problem is to express any given comparison as a linear
function of the orthogonal contrasts, i.e. we require the values of the
coefficients A, B ... Q, R, of these contrasts.

Using matrix notation and referring to Table 5 let

$$x = (a, b, \dots, q, r)', \qquad u = (A, B, \dots, Q, R)',$$

and let $G$ be the 18 × 18 array of coefficients defining the orthogonal
contrasts, i.e. $x$ is the vector of treatment combinations and $u$ is the
vector of coefficients of the orthogonal contrasts.

The set of orthogonal contrasts: A', A'', B' ... (including the total)
is given by

$$Gx \qquad (1)$$

Now let $t$ be a column vector of coefficients defining that comparison
whose error is required, so that the actual comparison is given by

$$t'x \qquad (2)$$

In order to express this contrast as a linear function of the A', A'',
etc., each of these is multiplied by a coefficient A, B, C ... Q, R and
identified with $t'x$.
Thus

$$u'Gx = t'x \qquad (3)$$

and so, comparing coefficients of $x$,

$$u'G = t' \qquad (4)$$


Since the rows of $G$ are orthogonal, $GG'$ is a diagonal matrix, and
hence the inverse of $GG'$ is also a diagonal matrix whose diagonal
elements are the reciprocals of the corresponding elements of $GG'$.
The diagonal elements of $GG'$ are given by the sums of squares of the
elements in the corresponding rows of $G$; they are given in the column
headed S.S. in Table 5 and may be denoted by $(SS)_A$, $(SS)_B \cdots (SS)_R$.

Postmultiplying (4) by $G'$ gives

$$u'GG' = t'G' = d, \text{ say,}$$

where

$$d = (d_A \; d_B \; \cdots \; d_R).$$

Hence

$$u' = \left(\frac{d_A}{(SS)_A} \;\; \frac{d_B}{(SS)_B} \;\; \cdots \;\; \frac{d_R}{(SS)_R}\right) \qquad (5)$$

which is the required solution.
(b) We now require the variance of the comparison given by $t'x$:

$$\operatorname{var}(t'x) = \operatorname{var}(u'Gx) \quad \text{from (3)}$$
$$= u' \operatorname{var}(Gx)\, u$$
$$= B^2 \operatorname{var}(A') + C^2 \operatorname{var}(A'') + \cdots + R^2 \operatorname{var}(A''B''C) \qquad (6)$$

since var $(Gx)$ is a diagonal matrix, the contrasts $Gx$ being orthogonal.
Now the variance of any contrast X (X = A', A'', B' ...
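$$t' = (-1 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 1 \;\; 0)$$
$$d = t'G' = (0 \;\; 2 \;\; 0 \;\; 1 \;\; {-3} \;\; 2 \;\; {-1} \;\; {-1} \;\; 1 \;\; {-3} \;\; 0 \;\; 2 \;\; {-1} \;\; {-1} \;\; 1 \;\; {-3} \;\; {-1} \;\; {-1})$$

i.e. the elements of $d$ are obtained by subtracting row by row 'column a'
of Table 5 from 'column q'; these elements are given in the column
(q − a) of Table 5. Substitution in (8) immediately gives var (q − a) =
(34/9)s²/2 as shown in the text.

(2) For the comparison A'N'

$$t' = (5 \;\; 3 \;\; 1 \;\; 0 \;\; 0 \;\; 0 \;\; {-5} \;\; {-3} \;\; {-1} \;\; {-1} \;\; {-3} \;\; {-5} \;\; 0 \;\; 0 \;\; 0 \;\; 1 \;\; 3 \;\; 5)$$
$$d = t'G' = (0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 16 \;\; 0 \;\; 0 \;\; 0 \;\; 36 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0)$$

Then

$$u' = (0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 2 \;\; 0 \;\; 0 \;\; 0 \;\; 3 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0 \;\; 0)$$

using (5) and so A'N' = 2A'B' + 3A'C as quoted. Var (A'N') then
follows using either (8) or (6) and (7) as shown in the text.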


A NOTE ON THE $4^n$ SERIES OF FACTORIAL EXPERIMENTS


P. J. CLARINGBOLD
Department of Veterinary Physiology, University of Sydney,
N.S.W., Australia


In the initial stages of an experimental investigation one of the $2^n$
series of factorial experiments is ideal for quick evaluation of important
factors and interactions. In subsequent and more detailed studies of
the response surface, designs involving more than two levels of factors
are often required. The principles involved in fractional replication
and confounding of factorial experiments are well established [Yates,
1937; Fisher, 1942, 1945; Finney, 1945, 1946; Kempthorne, 1947, 1952;
Cochran & Cox, 1950] and a considerable number of examples of
fractionally replicated or confounded $2^n$ or $3^n$ factorial experiments are
available for study. Few examples are to be found of factorial
experiments in which the number of factor levels is a power of a prime. One
example is given by Kempthorne & Tischer [1953] and is a 1/4 replicate
of an 8 × 4° × 2° split-split-plot factorial experiment. In the present
paper a symmetric example of the $4^n$ series is discussed.


$4^5$ Experiment in Quarter Replicate


The factors are denoted A, B, C, D and E, and the levels of each
factor numbered 0, 1, 2 and 3. Corresponding with each factor we
define a pair of pseudofactors each at two levels 0 and 1. Thus the
four levels of each factor correspond with the pseudofactors thus:


Level of A        Levels of pseudofactors
                      A'        A''
    0                 0         0
    1                 0         1
    2                 1         0
    3                 1         1


and likewise with the other factors.
Any treatment combination may be represented by the binary
number,

$$x_1x_2,\; x_3x_4,\; x_5x_6,\; x_7x_8,\; x_9x_{10},$$

where $x_i$ = 0 or 1. Successive pairs of digits correspond with different
factors. Thus the treatment combination specified by 00, 01, 10, 11, 11
represents the 0th level of factor A, 1st of B, 2nd of C and 3rd of both
D and E.

The identity relationship chosen as the basis of a quarter replicate
of this design is:

$$I = A'B'C'D'E' = A''B''C''D''E'' = A'A''B'B''C'C''D'D''E'E'' \qquad (1)$$

It is readily verified that no main effects or first order interactions
are mutually confounded in this design. It must be remembered,
however, that an interaction such as A'A''B'B'', although a third order
interaction between pseudofactors, is a first order interaction between
factors. Successive multiplications of equation (1) yield:

$$A' = B'C'D'E' = A'A''B''C''D''E'' = A''B'B''C'C''D'D''E'E'' \qquad (2)$$
$$A'A'' = A''B'C'D'E' = A'B''C''D''E'' = B'B''C'C''D'D''E'E'' \qquad (3)$$
$$A'A''B' = A''C'D'E' = A'B'B''C''D''E'' = B''C'C''D'D''E'E'' \qquad (4)$$
$$A'A''B'B'' = A''B''C'D'E' = A'B'C''D''E'' = C'C''D'D''E'E'' \qquad (5)$$
$$A'B' = C'D'E' = A'A''B'B''C''D''E'' = A''B''C'C''D'D''E'E'' \qquad (6)$$

This shows that at most in (6) a first order interaction is confounded 
with a second order interaction. Similar equations hold for other 
pairs of factors owing to the symmetry of the design. Working out 
the details of the alias relationship in asymmetrical cases is tedious 
unless card sorting techniques are available [see Kempthorne & Tischer 
1953]. 

After defining a suitable alias relationship the treatment combina- 
tions admissible to the quarter replicate must be listed. In the present 
example it is not very time-consuming to carry out this by hand. 
Treatment combinations corresponding to each quarter replicate are 
specified by: ; 


Sum z; Sum 2; 
Quarter replicate += 1,3,5,7,9. | j = 2, 4, 6, 8, 10. 


First Even Even 
Second Even Odd 
Third Odd Even 
Fourth Odd Odd 


SS ene nnce 
If it is decided to use the first quarter replicate defined above then
treatment combination

01, 01, 01, 01, 11

would be excluded since $\sum x_i$ is odd and $\sum x_j$ is odd. On the other
hand combination

00, 00, 00, 11, 11

would be included since $\sum x_i$ is even and $\sum x_j$ is even.
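
The parity rule for the first quarter replicate can be checked mechanically. The following is a minimal sketch, not part of the original account; the function names `in_first_quarter` and `parse` are illustrative assumptions. It enumerates all ten-digit binary codes and keeps those whose odd-position and even-position digit sums are both even.

```python
from itertools import product


def in_first_quarter(bits):
    """Return True if a code x1..x10 belongs to the first quarter replicate.

    bits is a tuple of ten 0/1 digits, x1 first.
    """
    odd_sum = sum(bits[0::2])   # x1 + x3 + x5 + x7 + x9
    even_sum = sum(bits[1::2])  # x2 + x4 + x6 + x8 + x10
    return odd_sum % 2 == 0 and even_sum % 2 == 0


def parse(combo):
    """Turn a string such as '01, 01, 01, 01, 11' into a tuple of ten digits."""
    return tuple(int(ch) for ch in combo if ch in "01")


# All admissible treatment combinations of the first quarter replicate.
quarter = [b for b in product((0, 1), repeat=10) if in_first_quarter(b)]
```

Under this rule `quarter` holds one quarter of the 1,024 possible codes, and the two combinations quoted in the text fall on the expected sides of the rule.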

The first quarter replicate was used in a study of the effect of oestro- 
gen on the mitotic rate of the vaginal epithelium of the ovariectomized 
mouse (see Claringbold, 1956, for a full account of the work). The 
analysis of variance presented no unusual problems, having the form:— 


Source of variation                 D.f.
Five main effects                   5 × 3 = 15
Ten first order interactions        10 × 9 = 90
Remaining interactions as error     150
Total                               255


DISCUSSION


It is easy to see that the $4^5$ factorial experiment is the smallest of
the series which may be quarter replicated and yet leave interactions
of less than three factors unconfounded: for confirmation check with
equation (6) and ignore factor E, or alternatively consider the degrees
of freedom (d.f.) in the $4^4$ experiment. Main effects require 12 d.f.
and first order interactions 54, a total of 66 which exceeds the number
of observations in the quarter replicate, i.e. 64.

The smallest experiment which may be 1/16th replicated is the
$4^7$ with 1,024 treatment combinations. The $4^6$ experiment may be
quarter replicated using a similar alias relationship to (1) with the
addition of $F'$, $F''$ and $F'F''$ to the second, third and fourth terms
respectively. The 1,024 treatment combinations of both experiments,
however, are probably too extensive for most experimenters. An
admissible 1/16th replicate of the $4^6$ experiment is not possible. It
thus appears that the $4^5$ experiment is the only one of the series which
may be fractionally replicated unless further assumptions can be made
about the main effects and first order interactions.
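
The degrees-of-freedom bookkeeping used in this discussion can be sketched as follows. This is an illustrative helper only (the names `df_required` and `fraction_size` are assumptions), and it checks a necessary condition, not the existence of a suitable alias structure: a fraction cannot estimate all main effects and first order interactions if it has fewer observations than those effects have degrees of freedom.

```python
def df_required(n_factors, levels=4):
    """Degrees of freedom for main effects and first order interactions."""
    main = n_factors * (levels - 1)
    pairs = n_factors * (n_factors - 1) // 2
    interactions = pairs * (levels - 1) ** 2
    return main, interactions


def fraction_size(n_factors, fraction, levels=4):
    """Number of treatment combinations in a 1/fraction replicate."""
    return levels ** n_factors // fraction


# The quarter replicate of four factors at four levels is too small ...
main4, inter4 = df_required(4)
too_small = main4 + inter4 > fraction_size(4, 4)

# ... while the quarter replicate of five factors leaves d.f. for error.
main5, inter5 = df_required(5)
error_df = fraction_size(5, 4) - 1 - main5 - inter5
```

With five factors the error line comes out at 150 d.f., in agreement with the analysis of variance shown earlier.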
The three d.f. for each main effect may be reduced in many ways
to sets of orthogonal single d.f. contrasts. One common method is
to separate the sum of squares corresponding with the three d.f. into
linear, quadratic and cubic components of regression. It may be
possible to assume that the cubic component together with its
interactions is negligible. For any factor at four levels we may compute
the sum of all responses at each level. If these treatment totals are
denoted $T_1$ to $T_4$ we may make the following orthogonal contrasts
between them,

$$A' = -T_1 - T_2 + T_3 + T_4$$
$$A'' = -T_1 + T_2 - T_3 + T_4$$
$$A''' = A'A'' = T_1 - T_2 - T_3 + T_4$$

The orthogonal contrasts giving the linear, quadratic and cubic
components of regression are respectively,

$$L = 2A' + A'' = -3T_1 - T_2 + T_3 + 3T_4$$
$$Q = A''' = T_1 - T_2 - T_3 + T_4$$
$$C = 2A'' - A' = -T_1 + 3T_2 - 3T_3 + T_4.$$

It will be seen that the first set of contrasts is the set we would obtain
if the four totals are considered the result of a $2^2$ experiment with the
contrasts $A'$ and $A''$ isolated as main effects and $A'''$ as the interaction.
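
The relations between the two sets of contrasts can be verified numerically. The sketch below is illustrative only; the variable names are assumptions. It represents each contrast by its coefficients on the four level totals and confirms that the regression contrasts are the stated combinations of the pseudofactor contrasts.

```python
def dot(u, v):
    """Inner product of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


# Pseudofactor contrasts on the totals T1..T4 (levels coded 00, 01, 10, 11).
a1 = (-1, -1, 1, 1)                          # A'
a2 = (-1, 1, -1, 1)                          # A''
a3 = tuple(p * q for p, q in zip(a1, a2))    # A''' = A'A''

# Regression contrasts built from the pseudofactor contrasts.
linear = tuple(2 * p + q for p, q in zip(a1, a2))   # L = 2A' + A''
quadratic = a3                                      # Q = A'''
cubic = tuple(2 * q - p for p, q in zip(a1, a2))    # C = 2A'' - A'
```

Both sets are mutually orthogonal, which is what allows the three d.f. of a main effect to be split either way.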

If it is possible to regard the cubic component of regression as
negligible, and if the contrast $A''$ is confounded then $A'$ gives an estimate
of the linear component of regression and $A'''$ of the quadratic
component. It is thus possible to confound one of the main effect d.f. of
the factor and its interactions with other factors. A design of this type
is of use in optimum condition studies where the peak of a response
surface is being sought, provided that the surface is no more complex
than the second degree. With more severe assumptions of this type
many more possibilities of fractional factorial experiments are
practicable in the $4^n$ series.
SOME SMALL SAMPLE TESTS OF SIGNIFICANCE FOR A
POISSON DISTRIBUTION


C. RADHAKRISHNA RAO AND I. M. CHAKRAVARTI
Indian Statistical Institute, Calcutta


0. INTRODUCTION 


The large sample tests associated with Poisson distributions such as 
(i) goodness of fit, (ii) homogeneity of the observations, (i.e. arising 
from the same Poisson distribution), (iii) deviation in the frequency 
of zero etc., are not valid if the Poisson parameter is small unless the 
sample size is extremely large. In fact the chi-square approximation 
for the ‘index of dispersion’ (variance test for homogeneity) seems to 
depend largely on the magnitude of the Poisson parameter rather than 
on the sample size. Sukhatme’s (1938) model sampling results indicate 
that the sample size need not be large provided the Poisson parameter 
is not small. The values of the parameter chosen by him are, however, 
all greater than unity and it is not known what happens for smaller 
values. 

Fisher (1950) provided an exact treatment of the goodness of fit 
and variance tests by considering the conditional probability of the 
observations given their total, which in view of the sufficiency of the 
total is independent of the unknown Poisson parameter. The present 
paper is in a sense a follow-up of the technique given by Fisher. 


A number of problems have been investigated: 


(i) A test based on the likelihood ratio is suggested as an alternative 
to the variance test of judging the homogeneity of different observations 
or more specifically whether the observations arise from a single Poisson 
population and not a compound Poisson distribution. This test seems 
to be better suited for this purpose than the variance test in small 
samples*. 

*It might be of interest to point out that the suggestion of using a likelihood instead of $\chi^2$ is quite
old. [Fisher (1922), Neyman and Pearson (1928), Cochran (1936).] But the present application seems
to be new.

(ii) An exact test of deviation in the ‘zero’ frequency or in general
for any other frequency is given. The large sample test given by Cochran
(1954) may not be valid for small values of the Poisson parameter when
the sample size is not very large. The exact test it may be noted is
independent of the unknown parameter.

(iii) Exact tests for goodness of fit and homogeneity have been
worked out for the Truncated Poisson distribution. The appropriate
index of dispersion test for the Truncated case has been given. This
differs slightly from the treatment of David and Johnson (1952).

Limited tables have been provided for the application of a number 
of tests associated with the Poisson distribution. 


1. HOMOGENEITY TESTS 


1.1. The index of dispersion 


If $x_1, x_2, \ldots, x_f$ are f observations from a Poisson population
then it is known that given the total

$$T = x_1 + x_2 + \cdots + x_f,$$

which is sufficient for the Poisson parameter, the conditional probability
of the observations is (see Rao, 1952, pp. 36-37)

$$\frac{T!}{x_1!\, x_2! \cdots x_f!} \left(\frac{1}{f}\right)^{T} \qquad (1.1)$$

which is multinomial with equal probabilities in f cells. To test the
hypothesis that $x_1, x_2, \ldots, x_f$ arise from Poisson populations with the
same parameter we need only examine whether a total number of T
observations from a multinomial distribution with equal probabilities
could produce the frequencies $x_1, x_2, \ldots, x_f$. The classical $\chi^2$ test
based on (f - 1) degrees of freedom for this is

$$\sum \frac{\{x_i - E(x_i \mid T)\}^2}{E(x_i \mid T)}$$

which reduces to

$$D = \frac{\sum (x_i - \bar{x})^2}{\bar{x}}$$

where $\bar{x} = E(x_i \mid T) = T/f$. This is referred to as the variance test.
For the validity of the $\chi^2$ approximation it is generally required that
the expected frequency T/f should be greater than 5 but it appears
that in the above case T/f could be much smaller than 5 as found by
Sukhatme (1938) in his model sampling investigation. To investigate
the nature of the approximation it is proposed to obtain the first four
moments of the variance statistic subject to the condition that T and
f are fixed.
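
For very small T and f the exact conditional significance level of the variance test can be obtained by brute force. The sketch below is an illustration under stated assumptions, not the authors' tabulation method: it enumerates every way of distributing T among f cells, weights each by the equal-probability multinomial, and cumulates the probability of configurations whose sum of squares is at least the observed one (for fixed T and f the variance statistic is an increasing function of the sum of squares). The names `compositions` and `exact_variance_test` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def exact_variance_test(x):
    """Exact conditional P(sum of squares >= observed | T) as a Fraction."""
    f, total = len(x), sum(x)
    observed = sum(v * v for v in x)
    p_value = Fraction(0)
    for cfg in compositions(total, f):
        if sum(v * v for v in cfg) >= observed:
            coef = factorial(total)
            for v in cfg:
                coef //= factorial(v)  # multinomial coefficient, exact at each step
            p_value += Fraction(coef, f ** total)
    return p_value
```

The number of configurations grows quickly, so this direct enumeration is only practical for the small values of T and f where the exact test is of interest.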
The conditional expectation of any function $\phi(x_1, x_2, \ldots, x_f)$ is
connected with its total expectation by the following relation, where
$\mu$ is the common mean of the individual Poissons

$$E(\phi) = \sum_{T} E(\phi \mid T)\, e^{-f\mu}\, \frac{(f\mu)^{T}}{T!}$$

or

$$\sum_{T} E(\phi \mid T)\, \frac{(f\mu)^{T}}{T!} = e^{f\mu} E(\phi).$$

Hence

$$E(\phi \mid T) = \frac{T!}{f^{T}} \{\text{Coeff. of } \mu^{T} \text{ in } e^{f\mu} E(\phi)\}.$$

Therefore knowing the total expectation, the conditional expectation
can be easily found. We can consider the statistic $S = \sum x_i^2 - f\bar{x}^2$
whose moments are known functions of $\mu$ and then derive its conditional
moments. For instance

$$E(S \mid T) = \frac{T!}{f^{T}} \{\text{Coefficient of } \mu^{T} \text{ in } e^{f\mu}(f - 1)\mu\} = \frac{T(f - 1)}{f}.$$



[Displayed expressions for the higher conditional moments and $\beta$-functions of D, including $\mu_2(D)$ and $\beta_2(D) - 3$, are not recoverable from the source at this point.]

It may be noted that the expressions obtained above refer to $\chi^2$-goodness
of fit of a multinomial with equal probabilities in f cells and a total of
T observations. For this case the expressions up to up, have been derived 
earlier by Haldane (1937) using a different method. Our object in
reproducing these expressions is two-fold. Firstly, as an illustration
of a simple method proposed for their derivation. Secondly, to provide
a direct comparison of the exact moments and the $\beta$-functions with
those of the $\chi^2$-approximation.

The rapidity with which the above $\beta$-criterion tends to zero depends
on the value of T/f and to some extent on the size of f as well.
Numerical evaluations of the criterion and the comparison of actual and
approximate probabilities suggest that it may be misleading to use
the $\chi^2$-approximations if T/f is less than unity and that the
approximation yields good results for T/f over 3. Where exact probabilities
are needed one has to use the multinomial probability (1.1) and cumulate
the probability for values of D equal to or larger than the observed.
Tables given at the end of the paper contain these expressions.


1.2. Likelihood ratio test as an alternative 


If the observations arise from different Poisson populations then
the likelihood of any set of parameters $\mu_1, \mu_2, \ldots, \mu_f$ is

$$e^{-\sum \mu_i} \prod \frac{\mu_i^{x_i}}{x_i!}$$

which attains the maximum

$$e^{-T} \prod \frac{x_i^{x_i}}{x_i!}$$

while the maximum likelihood for a common $\mu$ is

$$e^{-T} \left(\frac{T}{f}\right)^{T} \prod \frac{1}{x_i!}.$$

Hence the log likelihood ratio test for detecting heterogeneity is

$$L_H = -\sum x_i \log_e x_i - T \log_e f + T \log_e T$$


If T/f is large, $-2L_H$ is approximately $\chi^2$ with (f - 1) degrees of
freedom. According to general theory the statistics $L_H$ and D are
asymptotically equivalent and there is not much to choose between
them when T/f is large. Probably the same remarks about the size
of T/f and validity of the $\chi^2$ approximation as in the case of D hold
for $L_H$ also. But in small samples and/or when T/f is small, indicating
that an exact test is preferable, the $L_H$ test seems to have certain
advantages. The D statistic tends to be heavily grouped in small
samples so that the D with a cumulative probability less than or equal
to 5% may actually correspond to a much lower level of significance
because of the gaps in D. With $L_H$, much closer percentages are
obtained and consequently it has better chance of rejecting the null
hypotheses. $L_H$ is preferable to D for discriminating a compound
Poisson distribution from a simple Poisson.

As we are using conditional tests the statistic may be simply defined
by $L_H' = \sum x_i \log_e x_i$, just as in the case of D we may have $D' = \sum x_i^2$.
Fisher (1950) gives the following observed distribution:


Variate        0     1    2    3    Total
Frequency    124    12    2    2      140


The values of the statistics $D'$ and $L_H'$ are

$D' = \sum r^2 f_r = 38$
$L_H' = \sum f_r (r \log_e r) = 9.364362$


The cumulative probability using the exact expressions for $D'$ is .001112
while for $L_H'$ it is .0005732 indicating that it is more sensitive.
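
The two conditional statistics are simple functions of the frequency table. As a minimal sketch (the dictionary `freq` and the variable names are illustrative assumptions), they can be computed from Fisher's distribution as follows; the sum of squares reproduces the value 38 and the likelihood statistic comes out at approximately 9.364.

```python
from math import log

# Fisher's (1950) observed distribution: variate value -> frequency.
freq = {0: 124, 1: 12, 2: 2, 3: 2}

f = sum(freq.values())                       # number of observations
T = sum(r * n for r, n in freq.items())      # total of the observations

# Sum of x_i^2, grouping equal observations together.
d_prime = sum(r * r * n for r, n in freq.items())

# Sum of x_i log_e x_i; observations equal to 0 or 1 contribute nothing.
l_prime = sum(n * r * log(r) for r, n in freq.items() if r > 1)
```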




2. TEST FOR DEVIATION IN THE ‘ZERO’ FREQUENCY 


2.1. Large sample tests 


Situations arise in experimental investigations where the observation
‘zero’ from a Poisson population is under-represented or
over-represented. When its frequency is unascertainable or when it is not
considered we have the case of a truncated Poisson. It is therefore
reasonable to examine whether the zero frequency is the offender when
a goodness of fit test, such as $\chi^2$, shows a departure of the observed
distribution from what is expected on the basis of an underlying Poisson
distribution. A natural approach to this problem is to decompose the
total departure into two components, one due to the zero frequency
and the other due to the rest of the frequencies. Both are important
because of our interest in also examining whether the rest of the
distribution can be accounted for by a truncated Poisson. Departures may
exist in both. This can be accomplished in large samples by a partition
of the total $\chi^2$*. If $\chi_p^2$ is the total $\chi^2$ with $\chi_t^2$ for the truncated Poisson
then the relation

$$\chi_p^2 = \chi_t^2 + \chi_z^2$$

supplies the $\chi_z^2$ with one degree of freedom for detecting significant
departures in the zero frequency.


David and Johnson (1952) provide the following interesting
distribution of the number of decayed teeth.

Number of decayed teeth    0   1   2   3   4   5   6   7   8   9  10  11  12
Frequency of boys         61  47  43  35  28  15  20   5   5   3   0   1   2


Estimating the value of $\mu$ for the Poisson by the formula

$$\hat{\mu} = \bar{x} = 2.5736$$

we find the value (after grouping the classes 7 to 12)

$\chi_p^2$ = 152.3514 with 6 d.f.


As the classes 7 to 12 are grouped, the estimate of $\mu$ used from the
ungrouped data gives a slightly larger $\chi^2$ for 6 degrees of freedom.
As there is no proper method of adjusting for this an alternative estimate
of $\mu$ may be obtained by considering the classes over 6 as grouped [for a
similar method in a contingency table, see Rao (1952) p. 199]. In the
particular example considered above it is not likely to make any
difference because the classes grouped belong to the tail of a distribution with
a small total probability. Further, the computations carried out by
Chernoff and Lehmann (1954) for a Poisson distribution truncated
from above, show that no serious error is made. Finally it may be
noted that $\chi^2$ itself is an approximation and a small refinement in
adjustment for grouping of classes with a small probability is not worth
attempting.

*In the case of a Poisson distribution the number of classes is theoretically infinite. But for
purposes of computing $\chi^2$ it is necessary to group the classes beyond a certain value of the Poisson variable.
The large sample distribution of $\chi^2$ is valid when a finite number of classes obtained as stated above is
considered. In practice this is what is done. Since the tail frequencies will be small, they are combined
into a single class. In such a case the parameters have to be estimated from the grouped distribution
if the $\chi^2$ approximation is to be valid.

Omitting $f_0 = 61$, $\mu$ is estimated from the formula

$$\frac{T}{f - f_0} = \frac{m}{1 - e^{-m}}$$

m = 3.2075
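
The truncated-Poisson estimate has no closed form, but the mean of a zero-truncated Poisson, m/(1 - e^{-m}), increases steadily with m, so a simple bisection suffices. The sketch below is illustrative (the function names and the bracketing interval are assumptions); it uses the totals quoted for the decayed-teeth data, T = 682 observations summed over f - f_0 = 265 - 61 = 204 non-zero counts, and should land close to the printed estimate.

```python
from math import exp


def truncated_mean(m):
    """Mean of a Poisson variable with parameter m, truncated at zero."""
    return m / (1.0 - exp(-m))


def solve_m(target, lo=1e-9, hi=50.0, iterations=200):
    """Solve truncated_mean(m) = target for m by bisection (target > 1)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


T, f, f0 = 682, 265, 61
m_hat = solve_m(T / (f - f0))
```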


The goodness of fit $\chi^2$ for the truncated case is

$\chi_t^2$ = 31.0660 with 5 d.f.


Here again the same remarks as in the case of $\chi_p^2$ hold good. The test
for the ‘zero’ frequency is

$$\chi_z^2 = \chi_p^2 - \chi_t^2 = 152.3514 - 31.0660 = 121.2854$$

with 1 d.f., which is extremely large indicating something obviously
wrong. But it must also be noted that even the hypothesis of a
truncated Poisson is not tenable, as $\chi_t^2$ is also significant. Probably this
is a case of a compound Poisson with further trouble at the ‘zero’.
The hypothesis of homogeneity of the observations from a truncated
Poisson is examined in Section 4.

Alternatively we could use the likelihood ratio goodness of fit test
and its decomposition. If $f_0, f_1, \ldots$ denote the observed frequencies
and $m_0, m_1, \ldots$ those estimated under a given hypothesis then

$$L_G(p) = \sum_{r \ge 0} f_r \log_e \frac{m_r}{f_r}.$$

For the truncated Poisson

$$L_G(t) = \sum_{r \ge 1} f_r \log_e \frac{m_r'}{f_r}$$

where the $m_r'$ are estimated frequencies for the truncated Poisson. The
difference $L_z = L_G(p) - L_G(t)$ provides the test for zero frequency.
The statistic $-2L_z$ is asymptotically $\chi^2$ with one degree of freedom.

One could use the large sample test for an individual frequency
suggested by Cochran (1954) or the equivalent likelihood ratio criterion
for examining the zero frequency directly but such a test will not reveal
the true situation unless the rest of the distribution is known to behave
as a Poisson distribution. It is with this practical purpose in view that
the statistics $\chi_z^2$ and $L_z$ are defined.
2.2. Exact treatment of the ‘zero’ frequency


In Section 1 it was seen (Formula 1.1) that the conditional probability
of the Poisson variables is multinomial with number of cells equal to
the number of observations (f) and sample size equal to the sum of
the observations (T). The zero observation corresponds to an
unrepresented cell in a sample of size T. The problem can be posed in a
familiar way as one of the classical occupancy problems mentioned in
several text books on probability. From an urn containing equal
numbers of balls of each of f colours, T balls are drawn. What is the
probability that f - r colours are unrepresented? This probability is
easily found to be

$$P_r = \binom{f}{r} \left\{ \left(\frac{r}{f}\right)^{T} - \binom{r}{1} \left(\frac{r-1}{f}\right)^{T} + \cdots + (-1)^{r-1} \binom{r}{r-1} \left(\frac{1}{f}\right)^{T} \right\}.$$

Or using the notation of differences of zero the above probability can be
written

$$P_r = \binom{f}{r} \frac{\Delta^{r} 0^{T}}{f^{T}}.$$

If T < f, $P_{T+i}$ automatically takes the value zero for i = 1, 2, ....


To test the hypothesis that the observed frequency $f_0$ is more than
the expected we calculate the probability

$$\sum_{r=1}^{f-f_0} P_r = 1 - \sum_{r=f-f_0+1}^{f} P_r$$

using the convenient expression. If this is less than the level of
significance chosen we reject the hypothesis. To test the hypothesis of
under-representation of zero the probability to be calculated is

$$1 - \sum_{r=1}^{f-f_0-1} P_r = \sum_{r=f-f_0}^{f} P_r.$$

When both positive and negative departures of $f_0$ from the expected
are considered there is some arbitrariness in judging significance.
One may follow the rule of rejecting any observed $f_0$ not falling in the
range $(\theta_1, \theta_2)$ where $\theta_1$ and $\theta_2$ are determined from the formulae

$$\sum_{r=1}^{f-\theta_2-1} P_r \le \frac{\alpha}{2}, \qquad \sum_{r=f-\theta_1+1}^{f} P_r \le \frac{\alpha}{2}$$

in which $\alpha$ stands for the level of significance.
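
The occupancy probabilities can be evaluated exactly with integer arithmetic. The sketch below is an illustration under the assumption that $P_r$ is the classical probability that exactly r of f equally likely cells are occupied after T independent placements, computed by inclusion and exclusion; the function names are hypothetical. The last line evaluates the one-sided tail for Fisher's data (f = 140, $f_0$ = 124, T = 22).

```python
from fractions import Fraction
from math import comb


def occupancy_prob(r, f, T):
    """P(exactly r of f cells are occupied after T placements), exactly."""
    # Number of ways to place T items onto all of r labelled cells.
    onto = sum((-1) ** j * comb(r, j) * (r - j) ** T for j in range(r + 1))
    return Fraction(comb(f, r) * onto, f ** T)


def prob_zero_excess(f0, f, T):
    """P(zero frequency >= f0), i.e. at most f - f0 cells occupied."""
    return sum(occupancy_prob(r, f, T) for r in range(1, f - f0 + 1))


tail = prob_zero_excess(124, 140, 22)
```

Working in `Fraction` avoids the cancellation that floating point arithmetic would suffer in the alternating sum; `float(tail)` gives the significance level in decimal form.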
The values $P_r$ can be obtained from the comprehensive tables for
$\Delta^{r} 0^{T} \div r!$ prepared by Stevens (1937) and reproduced in Fisher and
Yates’ Tables. Stevens considers several examples from genetics where
this test is useful. In Fisher’s example we find using Stevens’ tables
for T = 22,

$$\sum_{r=1}^{16} P_r = .00158237$$

which shows that the ‘zero’ is somewhat over-represented. It is not
easy to compute the expressions for cumulative probabilities for large
values of T. Normal approximation can be used if f is large*. The
exact expressions for mean and variance are found below.

The k* uncorrected moment of r is given by the value of the follow- 

ing expression at 7 = 1 


1 AGA (eno aly 


AGN 
"2 = Af Sle ie Nore ge Ne 


vu = (AY - (GS 4)" 4 av - of 2) 
For large f, the normal deviate is 


Te eg ore 
Dor es Tapes AE VF 0) 
In David and Johnson’s (1952) example, the frequency for zero is
$f_0 = 61$ out of a total of f = 265. The value of $T = \sum r f_r = 682$,
$\bar{x} = 2.5736$


{r- (5)} 
V V(fo) V/146.802355 : 


*One of the referees of this paper points out that the fact that normal approximation can be used 
for f large is proved by I. Weiss in 1952 in a thesis entitled ‘Limiting distributions in some occupancy 
problems’’. (Technical Report No. 28 prepared under contract N6onr-25140 for the Office of Naval 
Research at Stanford University, July 1955). 
which is too large for a normal deviate, establishing significance in the
frequency of zero.
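
The size of the deviate can be gauged with the standard occupancy results for the number of empty cells. The sketch below assumes, rather than quotes from the text, the classical expressions $E(f_0) = f(1 - 1/f)^T$ and $V(f_0) = f(f-1)(1 - 2/f)^T + E(f_0) - \{E(f_0)\}^2$; the function name is hypothetical.

```python
from math import sqrt


def zero_class_moments(f, T):
    """Conditional mean and variance of the zero frequency given T."""
    p_one_empty = (1.0 - 1.0 / f) ** T   # a given cell is empty
    p_two_empty = (1.0 - 2.0 / f) ** T   # two given cells are both empty
    mean = f * p_one_empty
    variance = f * (f - 1) * p_two_empty + mean - mean ** 2
    return mean, variance


# David and Johnson's data: f = 265 observations, T = 682, f0 = 61 zeros.
mean, variance = zero_class_moments(265, 682)
deviate = (61 - mean) / sqrt(variance)
```

Under these assumptions the observed zero frequency lies many standard deviations above its conditional expectation, consistent with the conclusion drawn in the text.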

If T is large compared to f then $P_r$ can be approximated by [see
Feller, (1950)]

$$P_r \approx e^{-\lambda} \frac{\lambda^{f-r}}{(f-r)!}$$

where $\lambda = f e^{-T/f}$. In this case the Poisson tables may be used for
obtaining the cumulative probabilities.


3. CONDITIONAL VARIANCES AND COVARIANCES OF THE FREQUENCIES 


Any other frequency or in general any linear combination of the
frequencies can be tested for deviation from the expected values by
using the exact variances and covariances and the large sample normal
approximation. By using the technique given in Section 1 the
conditional mean and variance of $f_i$, the frequency of the observation i, is

$$E(f_i) = \frac{T!\,(f-1)^{T-i}}{i!\,(T-i)!\,f^{T-1}}$$

$$V(f_i) = E(f_i) - \{E(f_i)\}^2 + \frac{T!\, f(f-1)(f-2)^{T-2i}}{(i!)^2\,(T-2i)!\, f^{T}}$$

$$\mathrm{Cov}(f_i, f_j) = \frac{T!\, f(f-1)(f-2)^{T-i-j}}{i!\, j!\,(T-i-j)!\, f^{T}} - E(f_i)E(f_j)$$

From the above formulae the exact mean and variance for any linear
function of the frequencies can be calculated and its significance can
be tested by using a normal approximation if f is large. In an example
given by Cochran (1954) where f = 240, T = 388, the value of $f_3 = 52$.
To test whether it significantly departs from the expected we compute

$$\frac{f_3 - E(f_3)}{\sqrt{V(f_3)}} = \frac{52 - 33.60}{\sqrt{V(f_3)}}$$

which is significant. The large sample approximation given by Cochran
(1954) slightly over-emphasizes significance. The exact probability
for $f_i$ quoted by Feller (1950) is


Gay hi aie 2 Skene sear 
Py = pe CY Gig — ae — lay 


the summation extending over those j > f; for which Fe ERIN) ed 
This is, however, difficult to deal with computationally. 
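
By contrast, the conditional mean and variance of a single frequency are easy to evaluate exactly. The sketch below assumes the multinomial occupancy forms $E(f_i) = f\binom{T}{i}(f-1)^{T-i}/f^{T}$ and $E\{f_i(f_i - 1)\} = f(f-1)\,T!\,(f-2)^{T-2i}/\{(i!)^2 (T-2i)!\, f^{T}\}$, which require $T \ge 2i$; the function names are hypothetical. It is applied to the example with f = 240, T = 388 and an observed frequency of 52 for the value 3.

```python
from fractions import Fraction
from math import factorial, sqrt


def falling(n, k):
    """Falling factorial n(n-1)...(n-k+1), equal to n!/(n-k)!."""
    out = 1
    for j in range(k):
        out *= n - j
    return out


def freq_moments(i, f, T):
    """Exact conditional mean and variance of f_i given T, as floats."""
    mean = Fraction(f * falling(T, i) * (f - 1) ** (T - i),
                    factorial(i) * f ** T)
    pair = Fraction(f * (f - 1) * falling(T, 2 * i) * (f - 2) ** (T - 2 * i),
                    factorial(i) ** 2 * f ** T)
    variance = mean - mean ** 2 + pair
    return float(mean), float(variance)


mean3, var3 = freq_moments(3, 240, 388)
deviate3 = (52 - mean3) / sqrt(var3)
```

The expected frequency agrees with the value 33.60 used in the text; the deviate follows from the exact variance.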


274. BIOMETRICS, SEPTEMBER 1956 
4. TRUNCATED POISSON 


4.1. The goodness of fit test 


In the case of the truncated Poisson the probability of the observed
frequencies $f_1, f_2, \ldots$ (with a total of f') given $T = \sum r f_r$ is

$$\frac{f'!}{f_1!\, f_2! \cdots} \cdot \frac{T!}{\prod_r (r!)^{f_r}} \cdot \frac{1}{\phi(T, f')}$$

where

$$\phi(T, f') = f'^{\,T} - \binom{f'}{1}(f'-1)^{T} + \binom{f'}{2}(f'-2)^{T} - \cdots$$

The corresponding expression for the full Poisson is simply $f^{T}$. The
probability for $f_1, f_2, \ldots$ given T is independent of the unknown
parameter because of the sufficiency property of the statistic T even
for the truncated case. The likelihood ratio statistic or the $\chi^2$ statistic

$$\sum f_r \log_e \frac{m_r}{f_r} \quad \text{or} \quad \sum \frac{(f_r - m_r)^2}{m_r},$$

where

$$m_r = f'\, \frac{e^{-m} m^{r}}{(1 - e^{-m})\, r!}$$

and m is estimated from the formula

$$\frac{T}{f'} = \frac{m}{1 - e^{-m}},$$

can be used to test goodness of fit in large samples. The remarks made
in Section 2.1 about the validity of $\chi^2$ approximation hold good in
this case also. In the case of the likelihood ratio statistic it has to be
multiplied by (-2) before referring it to the $\chi^2$-table. If the cell
frequencies are small the exact cumulated probability for any one of
the statistics can be calculated by using the formula for exact
probabilities of the different configurations. With small samples the
likelihood ratio test is preferable.


4.2. Homogeneity tests 


The likelihood ratio test for homogeneity based on the observations
$x_1, x_2, \ldots, x_{f'}$ from f' truncated Poisson populations is

$$L_H(t) = \log_e \frac{e^{-f'm}\, (T/mf')^{f'}\, m^{T}}{\prod_i e^{-m_i} (x_i/m_i)\, m_i^{x_i}}$$

where

$$\frac{m_i}{1 - e^{-m_i}} = x_i \quad \text{and} \quad \frac{m}{1 - e^{-m}} = \frac{T}{f'}.$$

The statistic $-2L_H(t)$ can be used as $\chi^2$ with (f' - 1) degrees of freedom
in large samples. In small samples the exact probabilities can be
calculated from the conditional probability

$$\frac{T!}{x_1! \cdots x_{f'}!} \cdot \frac{1}{\phi(T, f')}$$

where $\phi(T, f')$ is as defined in section (4.1).


The analogue of the index of dispersion statistic in the truncated
case appears to be

$$D(t) = \frac{\sum (x_i - \bar{x})^2}{\bar{x}(1 + m - \bar{x})}$$

where $\bar{x} = m/(1 - e^{-m})$. In large samples this has a $\chi^2$ distribution*
with (f' - 1) degrees of freedom. In small samples the statistic to be
used can be simply $\sum x_i^2$.

Using the observations of David and Johnson (1952) we found the
estimate of $\mu$ to be 3.2075. The index of dispersion D(t) for the
truncated case is


(Sat = pa) + alt +m — a) = 1008.9904 © 540.5050 


based on 203 d.f. Using the normal approximation for $\chi^2$, the normal
deviate is found to be 5.27. This is somewhat higher than the value
4.95 of the statistic suggested by David & Johnson, which differs from
D(t) in the denominator being simply $\bar{x}$.


4.3. Exact moments of $\sum x_i^2$

In the truncated case, mean and variance of $\sum x_i^2$ are
ED a) = FTC — YieT — 2,7) +97 — 2, f' — Y} 
+ fTioT —1,f) +e -1,f' - Die, f) 
(ROS et pbk Lig 1G We te Fh 8 eth HO ee Oy ee 2 oe 
+ ff’ — Det — 4,7’ — 2} 


*This follows from the general theory of the $\chi^2$ test discussed by Rao (1952, p. 177).
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+ 7 (2¢'(f' + 261 — 3, f’) + 2f'(2f’ + DoT — 3, f’ — YD 
+ 2f'(2f’ — NeT = 3, f' = 2)} 
+ TFS + G(T — 2, f’) + f'(2f’ + DoT — 2, f’ — I) 
SG! abe rapt eeases 
+ Tf{oT —1,f/) +7 — 1,7’ — DO", f) 
— {E()) a) }° 

where 


$$T^{(r)} = T(T-1) \cdots (T-r+1)$$


5. SOME EXACT TESTS FOR THE BINOMIAL DISTRIBUTION 


It may be noted that investigations similar to these in the Poisson
case can be carried out for the Binomial, Normal and other distributions
where the nuisance parameters can be avoided by considering conditional
probabilities. If f sets of s trials are made with probability p of success
in a trial then the probability of frequencies $f_0, f_1, \ldots, f_s$ of the possible
successes is

$$\frac{f!}{f_0!\, f_1! \cdots f_s!} \prod_r \left\{ \binom{s}{r} p^{r} (1-p)^{s-r} \right\}^{f_r}.$$

Observing that $\sum r f_r = T$ is sufficient for p and has the probability

$$\frac{S!}{T!\,(S-T)!}\, p^{T} (1-p)^{S-T},$$

where S = fs, we find the conditional probability of the frequencies

$$\frac{f!}{f_0!\, f_1! \cdots f_s!} \cdot \frac{T!\,(S-T)!}{S!} \prod_r \binom{s}{r}^{f_r}.$$

Using this expression the exact probability of any criterion of goodness
of fit can be computed.


The probability of f, given T and f is found to be 


GF ae ON NE) 


This expression can be used in computing the probability of tests of 
significance concerning the frequency of zero success. 

If f is not small, normal approximation holds good for $f_0$. The
exact mean and variance of $f_0$ are

$$E(f_0) = f\, \frac{(S-s)!\,(S-T)!}{(S-s-T)!\, S!}$$

$$V(f_0) = E(f_0) - \{E(f_0)\}^2 + f(f-1)\, \frac{(S-T)!\,(S-2s)!}{S!\,(S-2s-T)!}$$


6. TABLES AND THEIR USES 
6.1. Critical values 


Critical values for a level of significance $\alpha$ < .05 of the statistics
$\sum x_i^2$, $\sum x_i \log_e x_i$, $\sum f_r \log_e (f_r r!)$ and frequency of the ‘zero’ class
are given for T = 3(1)10 and f = 3(1)10(10)100. The exact level of
significance occurs as a lower entry under each critical value. When
the probability levels are too low, the next lower value of the statistic
is also recorded with the corresponding probability level if it is not
much above 5 percent. If the observed value of a statistic happens to
be equal to or greater than the tabulated value, the hypothesis is to be
rejected at a level of significance $\alpha$ < .05.

It will be seen that in many cases the exact probability (level of 
significance) is much below 5 percent. This is due to the fact that the 
distributions of the statistics under consideration are heavily grouped; 
the inclusion of a smaller realisable value of the statistic would increase 
the probability beyond the 5 percent point. 

The statistic $\sum f_r \log_e (f_r r!)$ is comparable to $\chi^2$ when the expression
$(T - f) \log_e f + T - T \log_e T$ is added to it and the sum is multiplied by 2.

For intermediate values of f, critical values of the statistics may be 
obtained approximately by choosing the nearest f tabulated values 
or by interpolation between two values of f. 


6.2. Illustrative examples 


The following frequency distribution is used to illustrate successively
the use of Tables 4, 1, 2 and 3.

r        0    1    2    3
$f_r$     63    5    1    1

Here f = 70 and $T = \sum x_i = \sum r f_r = 10$.


(i) Test for ‘zero’ frequency (one sided test to examine an excess over
the expected zero frequency): Observed frequency of the zero
class $f_0$ = 63. The tabulated value for T = 10 and f = 70 is seen
to be 63 with the associated significance level of .01. Significance
is noted when the observed value is equal to or greater than the
tabulated value. In this case the deviation from the expected is
significant indicating an excess over the expected frequency of zero.
(ii) Tests for homogeneity: Two statistics $\sum x_i^2$ and $\sum x_i \log_e x_i$ are
computed to test whether the individual observations come from
the same Poisson distribution.

(a) $\sum x_i^2 = \sum r^2 f_r = 1 \cdot 5 + 4 \cdot 1 + 9 \cdot 1 = 18$

This is in excess of the corresponding tabulated value 16 for
T = 10 and f = 70 with a significance level of .03. So the
hypothesis of homogeneity is rejected.

(b) $\sum x_i \log_e x_i = \sum r f_r \log_e r = 4.345$

The tabulated value of $\sum x_i \log_e x_i$ for T = 10 and f = 70 is
3.2 at a significance level of .03. So the hypothesis of homogeneity
is rejected on the basis of the second test.

(iii) Test for goodness of fit of the Poisson distribution:

The statistic $\sum f_r \log_e (f_r r!)$ is computed to test goodness of
fit and this comes out to be 271.549. On referring to Table 3
for the critical values of this statistic, it is seen that the observed
value is in excess of the tabulated value of 271.29 at a level of
significance equal to .03. So this indicates that the observations
do not belong to a Poisson distribution.
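
The sum of squares and the goodness of fit statistic in this example are straightforward to reproduce from the frequency table. The following sketch is illustrative only (the dictionary `freq` and the variable names are assumptions); it evaluates $\sum r^2 f_r$ and $\sum f_r \log_e (f_r r!)$ over the classes with non-zero frequency.

```python
from math import factorial, log

# Example distribution: value r -> frequency f_r.
freq = {0: 63, 1: 5, 2: 1, 3: 1}

f = sum(freq.values())                       # number of observations
T = sum(r * n for r, n in freq.items())      # total of the observations

# Homogeneity statistic (a): sum of squares of the observations.
sum_squares = sum(r * r * n for r, n in freq.items())

# Goodness of fit statistic: sum of f_r * log_e(f_r * r!).
gof = sum(n * log(n * factorial(r)) for r, n in freq.items() if n > 0)
```

Both values are then compared with the tabulated critical values for T = 10 and f = 70.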
MATCHED PAIRS IN SEQUENTIAL TRIALS FOR SIGNIFICANCE
OF A DIFFERENCE BETWEEN PROPORTIONS


W. Z. BILLEWICZ


Obstetric Medicine Research Unit (Medical Research Council), 
Midwifery Department, University of Aberdeen, Scotland 


INTRODUCTION 


I. Bross (1952) and P. Armitage (1954) have discussed the applica- 
tion of sequential methods to medical trials, pointing out that not only 
are they economical in use but also keep the effectiveness of a given 
method of treatment under continuous review. Yet sequential methods, 
though firmly established in various other experimental fields, have 
not often been used in medical research. 

One of the reasons for the limited use of these methods is that the 
success of the treatment and its length are frequently related to such 
individual characteristics of the patient as age, sex and severity of 
condition at entering the trial. The chance of contracting a particular 
disease may also vary with such environmental conditions as housing, 
dietary habits and occupation. For such reasons, medical research 
workers may insist on having experimental and control cases similar 
with respect to characteristics which they regard as important in a 
given case. From the statistician’s point of view this approach implies 
single or multiple stratification and a certain degree of correlation in 
response to treatment between members of the same pair. 

Stratification means a more complicated sampling machinery than 
would be necessary for unrestricted pairing, and correlation raises the 
question of the applicability of the usual sequential formulae to a 
matched-pair trial. The usual test based on unrestricted pairing 
will show, with required probabilities, whether the new treatment is 
or is not significantly better (or worse) than the standard, without 
involving us in any difficulties of selection or theoretical problems. 
At first glance, therefore, it might seem that the introduction of matched 
pairs is an unnecessary complication unless it can be shown to result 
in a sizeable reduction in the average amount of testing required to 
reach a decision. 

Even if there is little or no reduction in the average sample size, 
the matching procedure may be worth using for the following reasons. 
When a research worker insists on matching he usually has two things in
mind. The first is to eliminate as far as possible those differences 
between the subjects which may confuse the issue, so that when the 
experiment is completed the difference between experimental and control 
groups can be attributed to differential effect of treatments alone. The 
second, less frequently stated, is the possibility that the response to 
the new treatment may vary between strata and that some information 
on this point may help to assess the results or bring to light some 
hitherto unsuspected characteristics of the new treatment. 

On occasions the statistician may suspect that the research worker’s
request for matching really stems from fear that some “unusual pairing”
will vitiate the results and will consequently reassure him on this point.
When, however, there are clear cut differences between strata it seems
wasteful not to use this additional information. In addition the clarity
of the experiment is obviously improved when extraneous factors
known to “affect” the issue are controlled.

The usual experiment based on unrestricted pairing can shed no 
light on the differential response of strata while a careful scrutiny 
of a fairly long matched-pair experiment will yield some information 
on this point. To sum up, when there are good grounds for it, matching 
results in an improvement of the experimental design both from the 
medical and from the statistical point of view; it increases the confidence 
with which the research worker will interpret the results; it uses all 
information at our disposal; and the analysis of a completed experiment 
may yield valuable information, otherwise unobtainable. Thus, the 
necessity of setting up a slightly more complicated sampling machinery 
does not seem to be an excessive price to pay for these advantages. It
will be shown below that there is no additional cost in terms of sample
size.

The usual formulae for the sequential test for significance of a 
difference between proportions are valid when there is no correlation 
in response between members of the same pair. The object of this note 
is to examine the case when, as in matching, such a correlation exists. 
The note arose from a practical situation in which a sequential trial 
was strongly indicated on economic grounds, while the similarity of 
experimental and control cases was regarded as very important from
the medical point of view. 


DEFINITIONS 


For convenience, open, single-tail plans will be considered here. 
The procedure can be easily extended to cover two-tail tests. 
Given the critical proportions of successes, $p$ and $p'$ ($p < p'$), for
the standard and the experimental treatment and the acceptable errors
of both kinds*, $\alpha$ and $\beta$, the sequential plan is fully determined. The
probabilities of success determine the value of

$$\theta_1 = \frac{p'(1 - p)}{p'(1 - p) + p(1 - p')} > \tfrac{1}{2}, \quad \text{since } p' > p, \tag{1}$$

which together with errors of both kinds determine the upper boundary
line. For the case of “no difference”, $p = p'$ is accepted here so that
for the lower boundary $\theta_0 = \tfrac{1}{2}$.

In a sequential trial subjects are selected in pairs, one member 
being allocated to the experimental and the other to the control group. 
Of all pairs only the untied pairs yield information about the difference 
between the experimental and the control groups. Thus only SF and
FS pairs contribute to the test, where S stands for “success”, F for
“failure” and the first letter refers to the control group. It will be noted
that $\theta_1$ is the expected proportion of pairs favourable to the new treatment
(FS) among all untied pairs if the population values of the proportions
of successes in the two groups, $\pi$ and $\pi'$, are equal to $p$ and $p'$ respectively.

Using the graphical representation suggested by Armitage (see 
Figure 1) the equations of the boundaries are given by: 


$$\text{Upper boundary: } y_1 = a_1 + sx, \qquad \text{Lower boundary: } y_2 = -a_2 + sx, \tag{2}$$

where $x$ is the number of untied pairs and

$$a = \frac{2t}{\log_{10} \dfrac{\theta_1 (1 - \theta_0)}{\theta_0 (1 - \theta_1)}}. \tag{3}$$

When $\theta_0 = \tfrac{1}{2}$,

$$a = \frac{2t}{\log_{10} \dfrac{\theta_1}{1 - \theta_1}}, \tag{4}$$

$$s = \frac{\log_{10} \dfrac{1}{4\theta_1 (1 - \theta_1)}}{\log_{10} \dfrac{\theta_1}{1 - \theta_1}}, \tag{5}$$

where

$$t = \log_{10} \frac{1 - \beta}{\alpha} \ \text{ (for } a_1\text{)} \qquad \text{or} \qquad t = \log_{10} \frac{1 - \alpha}{\beta} \ \text{ (for } a_2\text{)},$$

$\alpha$ and $\beta$ being the adopted errors of both kinds.

*The adopted values of $\alpha$ and $\beta$ depend on the appreciation of risk involved in coming to a “wrong”
decision. $\alpha$ is the maximum proportion of instances in a long series of tests in which we may be led to
conclude that the experimental treatment is superior when this is not true. $\beta$ is the maximum proportion
of instances in a long series of tests in which we may be led to conclude that the experimental treatment
is not superior when in fact the opposite is true. In other words pairs of treatments for which the true
$\theta \le \tfrac{1}{2}$ will lead on the average to a decision in favour of the standard treatment with relative frequency
of at least $1 - \alpha$, while pairs of treatments for which the true $\theta \ge \theta_1$ will lead on the average to a
decision in favour of the experimental treatment with a relative frequency of at least $1 - \beta$.
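As an illustrative check on the definitions above, the sketch below computes the critical proportion from (1) and the boundary constants for the case $\theta_0 = \tfrac12$. The function names `theta1` and `plan` are hypothetical, and the formulas are assumed to take the standard form for a sequential test on the scale "FS pairs minus SF pairs"; the inputs are the unrestricted-pairing example used later in the paper ($p = .50$, $p' = .70$, $\alpha = .025$, $\beta = .05$).

```python
import math

def theta1(p, p_new):
    """Expression (1): expected share of FS pairs among untied pairs."""
    return p_new * (1 - p) / (p_new * (1 - p) + p * (1 - p_new))

def plan(theta, alpha, beta):
    """Intercepts a1, a2 and slope s of the two boundaries, assuming theta0 = 1/2."""
    denom = math.log10(theta / (1 - theta))
    a1 = 2 * math.log10((1 - beta) / alpha) / denom   # upper intercept
    a2 = 2 * math.log10((1 - alpha) / beta) / denom   # lower intercept
    s = math.log10(1 / (4 * theta * (1 - theta))) / denom
    return a1, a2, s

th = theta1(0.50, 0.70)
a1, a2, s = plan(th, alpha=0.025, beta=0.05)
print(round(th, 3), round(a1, 1), round(a2, 1), round(s, 4))
```

With these inputs the rounded values of $\theta_1$, $a_1$ and $s$ can be compared with the "unrestricted pairing" column of Table 3.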

In the usual method pairs are drawn at random from the whole 
population. When matching is used pairs are still drawn at random but 
both members of each pair belong to the same randomly selected 
stratum of the population. 

If the population is divided into $R$ strata and pairs are drawn from
them at random the probability of drawing an FS pair is given by
$\sum_{i=1}^{R} a_i p'_i (1 - p_i)$, where $p_i$ and $p'_i$ are the critical proportions of
successes in the strata and the $a_i$ their proportional sizes. The probability
of drawing an SF pair is given by $\sum_{i=1}^{R} a_i p_i (1 - p'_i)$. Consequently in
an experiment with matched pairs the critical value $\theta_{M_1}$ is

$$\theta_{M_1} = \frac{\sum_{i=1}^{R} a_i p'_i (1 - p_i)}{\sum_{i=1}^{R} a_i \left[ p'_i (1 - p_i) + p_i (1 - p'_i) \right]}, \tag{6}$$

where

$$\sum_{i=1}^{R} a_i p_i = p, \qquad \sum_{i=1}^{R} a_i p'_i = p' \qquad \text{and} \qquad \sum_{i=1}^{R} a_i = 1.$$

When $p_i = p'_i$ in all strata, that is when there is no difference between
the experimental and the control groups, $\theta_{M_0} = \tfrac{1}{2}$ and for the overall
proportions of successes we have $p = p'$.

Given the value of $\theta_{M_1}$ the boundaries for a matched-pair sequential
plan are calculated from expressions (2)-(5).
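A minimal sketch of expression (6), assuming the strata are supplied as three parallel lists; the function name `theta_matched` is hypothetical. The example uses the strata of the sampling experiment described later in the paper, with the new-treatment proportions implied by a proportional decrease of failures.

```python
def theta_matched(a, p, p_new):
    """Expression (6): critical share of FS pairs among untied matched pairs.

    a     -- proportional sizes of the strata (summing to 1)
    p     -- critical proportions of successes, standard treatment
    p_new -- critical proportions of successes, new treatment
    """
    fs = sum(ai * qi * (1 - pi) for ai, pi, qi in zip(a, p, p_new))  # P(FS pair)
    sf = sum(ai * pi * (1 - qi) for ai, pi, qi in zip(a, p, p_new))  # P(SF pair)
    return fs / (fs + sf)

sizes = [0.2, 0.5, 0.3]
standard = [0.20, 0.50, 0.70]
new = [0.52, 0.70, 0.82]

print(round(theta_matched(sizes, standard, new), 3))
# With no difference between treatments the value is one half.
print(theta_matched(sizes, standard, standard))
```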

The average number of untied pairs $\bar{n}_\theta$ needed to reach a decision
depends on the value of $\theta$ in the population. Three points are of special
interest and may be estimated from the expressions below. For an
experiment with unrestricted pairing we have:

When the true value of $\theta = \theta_1$,

$$\bar{n}_{\theta_1} = \frac{a_1 - \beta (a_1 + a_2)}{2\theta_1 - 1 - s}. \tag{7}$$

When the true value of $\theta = \theta_0 = \tfrac{1}{2}$,

$$\bar{n}_{\theta_0} = \frac{a_2 - \alpha (a_1 + a_2)}{s}. \tag{8}$$

At maximum, when $\theta$ is about half-way between $\tfrac{1}{2}$ and $\theta_1$,

$$\bar{n}_{\max} = \frac{a_1 a_2}{1 - s^2}. \tag{9}$$

When the true value of $\theta$ is greater than $\theta_1$ or smaller than $\tfrac{1}{2}$ the average
$\bar{n}_\theta$ is smaller than $\bar{n}_{\theta_1}$ or $\bar{n}_{\theta_0}$ respectively. For a matched-pair experiment
the corresponding values of $\bar{n}_{\theta_{M_1}}$, $\bar{n}_{\theta_{M_0}}$ and $\bar{n}_{\max}$ are obtained by
substituting in the above formulae the corresponding values of parameters
of the matched-pair plan.

To estimate the total amount of testing, that is the total number
of tied and untied pairs required to reach a decision, we multiply the
appropriate value $\bar{n}_\theta$ by $K_\theta$.

We have thus for the trial with unrestricted pairing:

$$N_{\theta_1} = \bar{n}_{\theta_1} K_{\theta_1}; \qquad N_{\theta_0} = \bar{n}_{\theta_0} K_{\theta_0} \qquad \text{and} \qquad N_{\max} = \bar{n}_{\max} K_{\max}. \tag{10}$$

For a matched-pair experiment substitute the corresponding values for
$\theta_M$.

In the case of the usual experiment

$$K = \frac{1}{\pi' (1 - \pi) + \pi (1 - \pi')}. \tag{11}$$

For a matched-pair experiment we have

$$K_M = \frac{1}{\sum_{i=1}^{R} a_i \left[ \pi'_i (1 - \pi_i) + \pi_i (1 - \pi'_i) \right]}. \tag{12}$$

The probabilities used to estimate $K$ are not necessarily the critical
values of $p$ and $p'$ but the values of $\pi$ and $\pi'$ in the population having
a given value of true $\theta$ or $\theta_M$. Of course when the population $\theta$ is
assumed to be equal to $\theta_1$ as in (7) the values of $\pi$ and $\pi'$ are the same
as the critical values used to determine the boundaries.
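The three average sample numbers and the multiplier can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the formulas are assumed to be the usual approximations that ignore overshoot at the boundaries, and the inputs ($a_1 = 7.5$, $a_2 = 6.1$, $s = .2334$, $\theta_{M_1} = .725$ and the stratum proportions) are the rounded figures quoted in the paper's worked example.

```python
def asn_untied(a1, a2, s, theta_1, alpha, beta):
    """Average number of untied pairs at theta = theta_1, theta = 1/2, and at maximum."""
    n_theta1 = (a1 - beta * (a1 + a2)) / (2 * theta_1 - 1 - s)
    n_theta0 = (a2 - alpha * (a1 + a2)) / s
    n_max = a1 * a2 / (1 - s ** 2)
    return n_theta1, n_theta0, n_max

def multiplier(a, pi, pi_new):
    """Reciprocal of the probability that a pair is untied (all pairs per untied pair)."""
    untied = sum(ai * (q * (1 - x) + x * (1 - q)) for ai, x, q in zip(a, pi, pi_new))
    return 1 / untied

n1, n0, nmax = asn_untied(7.5, 6.1, 0.2334, 0.725, alpha=0.025, beta=0.05)
k_matched = multiplier([0.2, 0.5, 0.3], [0.20, 0.50, 0.70], [0.3973, 0.7250, 0.8602])
k_usual = multiplier([1.0], [0.50], [0.70])   # a single stratum: unrestricted pairing

print(round(n1), round(n0), round(nmax))
print(round(k_matched, 2), round(k_usual, 2))
print(round(n1 * k_matched))                  # average number of all pairs at theta_M1
```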


SEQUENTIAL PLANS FOR MATCHED PAIRS 


When there are sound reasons for having pairs similar in certain 
characteristics we may assume that there is enough information to 
enable us to obtain estimates of $a_i$ and $p_i$ reasonably close to the
population values. The proportion of successes thought to indicate
superiority of the new treatment will usually be given in terms of $p'$, while
the method requires knowledge of the values of $p'_i$ for each stratum
separately. Two of the possible methods of determining these values
will be considered:

(1) We may assume a constant value of $\theta$ in all strata, i.e. a
particular set of $p'_i$ values.

(2) We may assume a proportional change in the proportion of
successes or failures in all strata, i.e. a particular set of $\theta_i$ values.


Method 1.


The assumption of constant $\theta$ in all strata implies that the greater
the proportion of successes already obtained in a stratum, the smaller
the proportional improvement necessary to make the new treatment
worth while. The absolute difference between $p'_i$ and $p_i$ depends on
the value of $p_i$ and is generally greatest when $p_i$ is not far from $\tfrac{1}{2}$.

It will be shown later* that for any set of strata and any set of $p_i$
values $\theta_{M_1} > \theta_1$, where $\theta_{M_1}$ is defined by (6), and also that for any set
of true proportions of successes, $\pi_i$ and $\pi'_i$, resulting in a given value
of true $\theta$, constant in all strata, the average number of untied pairs
and the average number of all pairs required to reach a decision is in
general smaller for the matched-pair trial than for the usual method.

The value of constant $\theta_{M_1}$ may be found from the equation of the
type:

$$\frac{a_1 \theta_{M_1} p_1}{1 - p_1 - \theta_{M_1} (1 - 2p_1)} + \cdots + \frac{a_R \theta_{M_1} p_R}{1 - p_R - \theta_{M_1} (1 - 2p_R)} = p'. \tag{13}$$

The solution of this equation is rather cumbersome and it is much easier
to proceed by successive approximation as shown in Table 1 below.


TABLE 1
Calculation of Constant $\theta_{M_1}$

                                                 Strata               All
Item                                        I        II       III     strata
Proportional sizes of the strata, $a_i$     .2       .5       .3      1.0
Standard treatment, $p$                     .20      .50      .70     .50
New treatment critical value, $p'$          -        -        -       .70
Implied† values of $p'$ for
    $\theta_{M_1} = .72$                    .3913    .7200    .8571   .6954
    $\theta_{M_1} = .73$                    .4088    .7300    .8632   .7057
    $\theta_{M_1} = .725$                   .3973    .7250    .8602   .7000

†These values are obtained from $p'_i = \theta_{M_1} p_i / [1 - p_i - \theta_{M_1} (1 - 2p_i)]$.

*See section: Proofs.

Thus for the critical value of $p' = .7$ this particular set of strata sizes
and $p_i$ values leads to $\theta_{M_1} = .725$. Arriving at the value of $\theta_{M_1}$ by
successive approximation rather than by solving equation (13) has
the advantage of showing at the same time the values of $p'_i$ in the
strata. Assuming the acceptable errors to be $\alpha = .025$ and $\beta = .05$
and substituting the value of $\theta_{M_1} = .725$ into (3)-(5) we obtain: $a_1 = 7.5$,
$a_2 = 6.1$, and $s = .2334$. The boundaries of the plan are then given
by (2), as:

$$y_1 = 7.5 + .2334x$$

and

$$y_2 = -6.1 + .2334x.$$


The sequential chart corresponding to this plan is shown in Figure 1. 
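The successive approximation of Table 1 can also be mechanised. The sketch below is not the author's procedure: it swaps in plain bisection, which works because the implied overall proportion rises steadily with the trial value of the constant. The function names are hypothetical.

```python
def implied_p_new(theta, p):
    """New-treatment proportion implied by a constant theta in a stratum with standard rate p."""
    return theta * p / (1 - p - theta * (1 - 2 * p))

def constant_theta(a, p, p_new_overall, tol=1e-9):
    """Find the constant theta whose implied overall new-treatment rate equals p_new_overall."""
    lo, hi = 0.5, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        overall = sum(ai * implied_p_new(mid, pi) for ai, pi in zip(a, p))
        if overall < p_new_overall:
            lo = mid          # trial value too small
        else:
            hi = mid          # trial value too large
    return (lo + hi) / 2

sizes = [0.2, 0.5, 0.3]
standard = [0.20, 0.50, 0.70]
theta_m1 = constant_theta(sizes, standard, 0.70)
strata_rates = [implied_p_new(theta_m1, pi) for pi in standard]

print(round(theta_m1, 3))
print([round(q, 4) for q in strata_rates])
```

As in the tabular method, the stratum values of the new-treatment proportion are obtained as a by-product.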


[Figure 1: sequential chart. Horizontal axis: number of untied pairs, X.
Vertical axis: number of FS pairs minus number of SF pairs, Y. Boundaries
drawn: Y = 7.5 + .2334X (upper) and Y = -6.1 + .2334X (lower), for
$\alpha = .025$ and $\beta = .05$.]

FIGURE 1. SEQUENTIAL CHART FOR CONSTANT $\theta$ = .725; SHOWING ONE OF THE
EXPERIMENTAL SAMPLE PATHS.


Only the untied pairs are plotted on the chart. The sample path 
starts at the origin and for each pair favourable to the new treatment 
(FS) ascends by one step, while for each unfavourable pair (SF) it moves 
by one step downwards. Thus the first three untied pairs of the sample 
shown in Figure 1 were FS pairs, the fourth was an SF pair, and so on. 
Sampling proceeds until the sample path crosses one of the two
boundaries. If the sample path crosses the upper boundary $y_1$ we may
conclude that the true difference between the treatments is not zero;
whereas if the lower boundary $y_2$ is hit we may conclude that the
difference, if any, is smaller than the critical value chosen at the
beginning of the trial.
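The plotting rule can be written as a short routine. This is a sketch under stated assumptions: the function name `run_plan` is hypothetical, pairs are coded as the strings "FS" and "SF", and reaching a boundary exactly is treated as crossing it, a convention the text does not spell out.

```python
def run_plan(untied_pairs, a1, a2, s):
    """Follow the sample path over untied pairs and report which boundary ends the trial."""
    y = 0
    for x, pair in enumerate(untied_pairs, start=1):
        y += 1 if pair == "FS" else -1      # FS favours the new treatment
        if y >= a1 + s * x:
            return "upper", x               # difference between treatments accepted
        if y <= -a2 + s * x:
            return "lower", x               # no difference of the critical size
    return "undecided", len(untied_pairs)

# Boundaries of the constant-theta plan: y = 7.5 + .2334x and y = -6.1 + .2334x.
print(run_plan(["FS"] * 20, 7.5, 6.1, 0.2334))
print(run_plan(["SF"] * 20, 7.5, 6.1, 0.2334))
print(run_plan(["FS", "FS", "FS", "SF"], 7.5, 6.1, 0.2334))
```

An unbroken run of favourable pairs reaches the upper line only at the tenth untied pair, while an unbroken run of unfavourable pairs reaches the lower line at the fifth.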


Method 2. 


If we want to base our plan on proportional changes in the strata
we may take either the proportion of successes or the proportion of
failures as the starting point. If we assume that the proportion of
successes changes proportionately in all strata we have

$$\frac{p'_1}{p_1} = \frac{p'_2}{p_2} = \cdots = \frac{p'_R}{p_R} = \frac{p'}{p} = \lambda.$$

This implies that the greater the proportion of successes already observed
in a stratum the greater is the absolute improvement necessary to make
the new treatment worthwhile. It may, of course, happen that a
given $p'/p = \lambda$ will be found impossible for some strata or will lead to
unreasonable values of $p'_i$. However, the problem may always be set
in terms of a proportional decrease of failures, so that

$$\frac{1 - p_1}{1 - p'_1} = \frac{1 - p_2}{1 - p'_2} = \cdots = \frac{1 - p_R}{1 - p'_R} = \frac{1 - p}{1 - p'} = \lambda'.$$

In this case, however, the implications are different. Proportional
decrease of failures implies that the greater the number of successes
in a given stratum the smaller the absolute improvement required to
accept the new treatment. In this respect the approach resembles that
of Method 1, in which the proportional improvement decreased with
the increase of the proportion of successes already observed in a given
stratum. To construct a sequential plan on the assumption of
proportional changes we determine $\theta_{M_1}$ from (6), where $p'_i = \lambda p_i$ and
$\lambda = p'/p$. If proportional decrease of failures is assumed, $\theta_{M_1}$ is
obtained by substituting in (6) values of

$$p'_i = 1 - \frac{1 - p_i}{\lambda'}, \qquad \text{where } \lambda' = \frac{1 - p}{1 - p'}.$$

The remaining steps are the same as described for Method 1. It can
be proved that $\theta_{M_1} > \theta_1$ and that the average sample length is smaller
than for unrestricted pairing.
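For the proportional decrease of failures the calculation runs as follows. This is an illustrative sketch with hypothetical function names, applied to the three strata used in the paper's sampling experiment (overall proportions .50 and .70).

```python
def failures_rule(p, p_overall, p_new_overall):
    """Stratum rates for the new treatment when failures fall by a common factor."""
    lam = (1 - p_overall) / (1 - p_new_overall)      # common ratio of failure rates
    return [1 - (1 - pi) / lam for pi in p], lam

def theta_matched(a, p, p_new):
    """Share of FS pairs among untied matched pairs, as in expression (6)."""
    fs = sum(ai * qi * (1 - pi) for ai, pi, qi in zip(a, p, p_new))
    sf = sum(ai * pi * (1 - qi) for ai, pi, qi in zip(a, p, p_new))
    return fs / (fs + sf)

sizes = [0.2, 0.5, 0.3]
standard = [0.20, 0.50, 0.70]
new_rates, lam = failures_rule(standard, 0.50, 0.70)

print(round(lam, 3), [round(q, 2) for q in new_rates])
print(round(theta_matched(sizes, standard, new_rates), 3))
```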


SAMPLING EXPERIMENT 


Ten samples were drawn, with the aid of tables of random numbers,
according to the rules of each method, to see how the two methods
worked under sampling conditions and how they compared with the
usual method of unrestricted pairing from the whole population.

The sampling population and changes in the proportions of successes 
implied by each method are shown in Table 2. 


TABLE 2
Experimental Population and the Assumed Changes in the
Proportions of Successes

                                            Strata of the population    All
Item                                         I        II       III      strata
Sizes of the strata, $a_i$                   .2       .5       .3       1.0
Standard treatment, $p$                      .20      .50      .70      .50
New treatment, $p'$
  Unrestricted pairing                       -        -        -        .70
  Method 1, constant
    $\theta_{M_1} = .725$                    .397     .725     .860     .70
  Method 2, proportional decrease
    of failures, $\lambda' = 1.666$          .52      .70      .82      .70

If a fixed-size sample were drawn, then with errors $\alpha = .025$ and
$\beta = .05$, 150 pairs would have to be drawn, yielding on the average
75 untied pairs.

Details of the sequential plans are set out in Table 3. 

The results of the experiments are shown in Table 4 where samples 
are arranged in order of length. The average length of the three groups 
of samples agrees very well with the expected values. There appears 
to be nothing unusual about the scatter of individual sample lengths. 
The average number of untied pairs was 43.3, 31.8 and 35.4 for
unrestricted pairing, constant $\theta_M$ and the proportional change methods
respectively; all three values agree closely with expectations given in
Table 3. A check on the average proportions of SS, FF, FS and SF
pairs showed that these too were in good agreement with theory.
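A sampling experiment of this kind is easy to repeat by machine. The sketch below is not a reconstruction of the author's samples: it uses a pseudo-random generator with an arbitrary seed, a hypothetical function `simulate_trial`, and the constant-theta plan ($a_1 = 7.5$, $a_2 = 6.1$, $s = .2334$) with the stratum proportions of Method 1 taken as the true values.

```python
import random

def simulate_trial(a, pi, pi_new, a1, a2, s, rng):
    """Draw matched pairs until a boundary is reached; return (decision, untied, all pairs)."""
    x = y = total = 0
    while True:
        total += 1
        i = rng.choices(range(len(a)), weights=a)[0]   # stratum of this pair
        control = rng.random() < pi[i]                 # success under standard treatment
        treated = rng.random() < pi_new[i]             # success under new treatment
        if control == treated:
            continue                                   # tied pair (SS or FF): not plotted
        x += 1
        y += 1 if treated else -1                      # FS moves up, SF moves down
        if y >= a1 + s * x:
            return "upper", x, total
        if y <= -a2 + s * x:
            return "lower", x, total

rng = random.Random(1)
sizes = [0.2, 0.5, 0.3]
standard = [0.20, 0.50, 0.70]
new = [0.3973, 0.7250, 0.8602]
runs = [simulate_trial(sizes, standard, new, 7.5, 6.1, 0.2334, rng) for _ in range(2000)]

mean_untied = sum(r[1] for r in runs) / len(runs)
mean_total = sum(r[2] for r in runs) / len(runs)
share_upper = sum(r[0] == "upper" for r in runs) / len(runs)
print(round(mean_untied, 1), round(mean_total, 1), round(share_upper, 3))
```

The simulated averages may be set against the expectations in Table 3; some excess over the tabulated figures is to be expected, since the approximate formulae ignore the overshoot at the boundaries.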

In any individual sample the proportions of cases drawn from
various strata may differ a great deal from those assumed. Examination
of experimental samples did not give any clear evidence that the usual
sampling variation in the proportions of cases drawn from various
strata was associated with the length of the sample. In the case of
samples drawn on the basis of proportional change there was, however, a
suggestion that samples in which Stratum III, with the lowest $\theta_i$ and
highest $K$, was over-represented tended to be slightly longer. One
would expect Method 1 to be the more stable of the two, since though
$K$ varies from stratum to stratum the critical ratio of FS to all untied
pairs remains constant. In Method 2 both $K$ and $\theta$ vary from stratum
to stratum and in extreme strata the value of the latter may differ a
great deal from that used to determine the boundary lines.

TABLE 3
Details of the Sequential Plans

                                            Unrestricted   Constant       Proportional
Item                                        pairing        $\theta_M$     change
$\theta_1$ or $\theta_{M_1}$                .700           .725           .716
$a_1$                                       8.6            7.5            7.9
$s$                                         .2058          .2334          .2234
Average number of untied pairs
  When $\theta = \theta_1$ or
    $\theta_M = \theta_{M_1}$               40             32             35
  When $\theta = \theta_M = \tfrac12$       32             25             27
  At maximum, when $\theta$ or $\theta_M$
    is approximately half way between
    $\tfrac12$ and $\theta_1$ or
    $\theta_{M_1}$                          63             48             53
Multiplier $K$ or $K_M$*
  When $\theta = \theta_1$ or
    $\theta_M = \theta_{M_1}$               2.00           2.25           2.16
  When $\theta = \theta_M = \tfrac12$       2.00           2.27           2.27
  At maximum                                2.00           2.27           2.21
Average number of all pairs
  When $\theta = \theta_1$ or
    $\theta_M = \theta_{M_1}$               80             72             75
  When $\theta = \theta_M = \tfrac12$       64             56             61
  At maximum                                126            110            118

*For the purpose of calculating $K$, $p_i$ values were treated as the true proportions of successes as
indeed they are arranged to be in this experiment. The equality of $K$ values in the first column is
accidental; when $\pi = .5$ we have $K = 2$ irrespective of the value of $\pi'$.

To check this point five additional samples were drawn by each of 
the two methods; drawing from the strata with the same values of 
$p_i$ and $p'_i$, but with probabilities .2, .3 and .5 instead of those given in
Table 2. The average length of the five samples in case of constant
$\theta_M$ was 76.4 pairs (expected length 72 pairs); the shortest series was
54 pairs while the longest was 108. In the case of proportional change,
however, the five samples yielded an average of 144.4 pairs (expected
length 75 pairs) and the shortest series was 92 while the longest took
203 pairs before a decision could be reached. This result confirms the
inference that Method 2 is more sensitive than Method 1 to the variation
in the proportions of cases drawn from various strata. The importance
of variation in the proportions of cases drawn from various strata
should not be exaggerated since the proportions were deliberately
arranged to do most harm. Stratum III, i.e. the one with the highest
$K$ in case of constant $\theta_M$ and with the highest $K$ and the lowest $\theta_i$ in
case of proportional change, was heavily over-represented. This
particular distribution of cases served to make the point clear but in practice
such extreme variation occurs rarely and moreover would be very
unlikely to persist as the test goes on. It appears that under normal
sampling conditions both methods of arranging a sequential plan for
matched pairs yield consistent results, the method of constant $\theta_M$
being slightly more stable.


PROOFS 


Various statements made in the preceding sections will now be proved. 

It will be shown now that $\theta_{M_1} > \theta_1$. $\theta_{M_1}$ (6) was defined in such a
way that $\sum_{i=1}^{R} a_i p'_i = p'$, the critical value for the new treatment. The
values of $p$ and $p'$ determine $\theta_1$ for an experiment with unrestricted
pairing. We will show that, if $\theta_1$ were assumed to operate in all strata,
the resulting $\sum_{i=1}^{R} a_i p'_i$ would in general be smaller than the critical
value $p'$ for which the plan is to be designed. It follows from (1) that
the difference between the critical value $p'$ and $\sum_{i=1}^{R} a_i p'_i = p'_\theta$ may be
written in this case as:

$$p' - p'_\theta = \frac{\theta_1 p}{1 - p - \theta_1 (1 - 2p)} - \sum_{i=1}^{R} \frac{a_i \theta_1 p_i}{1 - p_i - \theta_1 (1 - 2p_i)}.$$

Remembering that $p = \sum_{i=1}^{R} a_i p_i$ we arrive at

$$p' - p'_\theta = \frac{\theta_1 (1 - \theta_1)(2\theta_1 - 1)}{1 - p - \theta_1 (1 - 2p)} \sum \frac{a_i a_j (p_i - p_j)^2}{[1 - p_i - \theta_1 (1 - 2p_i)][1 - p_j - \theta_1 (1 - 2p_j)]}, \tag{14}$$

the summation being taken over all $i$, $j$ combinations. Since $\tfrac{1}{2} < \theta_1 < 1$,
this expression must be positive unless all $p_i = p$ when it becomes zero.
It follows therefore that the constant $\theta_{M_1}$ used for a matched-pair plan
must in general be greater than $\theta_1$ and that the greater the differences
between strata the greater the difference between $\theta_{M_1}$ and $\theta_1$.

To prove the same proposition for Method 2, that is assuming
proportional change in $p_i$ in all strata, it is easier to start directly with
the difference $\theta_{M_1} - \theta_1$.

If the proportionality factor is $\lambda > 1$, we may write from (1) and (6)

$$\theta_{M_1} - \theta_1 = \frac{\sum_{i=1}^{R} a_i \lambda p_i (1 - p_i)}{\sum_{i=1}^{R} a_i [\lambda p_i (1 - p_i) + p_i (1 - \lambda p_i)]} - \frac{\lambda (1 - p)}{1 + \lambda - 2\lambda p},$$

which leads to

$$\theta_{M_1} - \theta_1 = \frac{\lambda (\lambda - 1) \sum a_i a_j (p_i - p_j)^2}{(1 + \lambda - 2\lambda p) \sum_{i=1}^{R} a_i [\lambda p_i (1 - p_i) + p_i (1 - \lambda p_i)]}, \tag{15}$$

which is positive unless all $p_i = p$, and therefore $\theta_{M_1} \ge \theta_1$.
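The identity for proportional change can be checked numerically. The sketch below assumes that the difference equals $\lambda(\lambda - 1)\sum a_i a_j (p_i - p_j)^2$ divided by $(1 + \lambda - 2\lambda p)\sum a_i[\lambda p_i(1 - p_i) + p_i(1 - \lambda p_i)]$, with the sum in the numerator taken over unordered pairs of strata; the value $\lambda = 1.2$ and the variable names are illustrative choices, not figures from the paper.

```python
from itertools import combinations

def theta(p, q):
    """Share of FS among untied pairs for success rates p (standard) and q (new)."""
    return q * (1 - p) / (q * (1 - p) + p * (1 - q))

a = [0.2, 0.5, 0.3]
p = [0.20, 0.50, 0.70]
lam = 1.2                                   # illustrative proportionality factor
q = [lam * pi for pi in p]                  # new-treatment rates, all below 1

p_bar = sum(ai * pi for ai, pi in zip(a, p))
fs = sum(ai * qi * (1 - pi) for ai, pi, qi in zip(a, p, q))
sf = sum(ai * pi * (1 - qi) for ai, pi, qi in zip(a, p, q))

lhs = fs / (fs + sf) - theta(p_bar, lam * p_bar)          # matched minus unrestricted
spread = sum(ai * aj * (pi - pj) ** 2
             for (ai, pi), (aj, pj) in combinations(list(zip(a, p)), 2))
rhs = lam * (lam - 1) * spread / ((1 + lam - 2 * lam * p_bar) * (fs + sf))

print(round(lhs, 6), round(rhs, 6))
```

The two sides agree to rounding error and are positive whenever the strata differ.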

Substituting the true $\theta$ of the population for $\theta_1$, and $\pi$ for $p$, we
may deduce the relation between the true $\theta$ and the true $\theta_M$ on the
assumption that the proportion of successes in the strata changes
proportionately to $\pi'/\pi = \lambda$ or according to the constant $\theta_M$ rule.
From (14) and (15) we see that if $\theta = \tfrac{1}{2}$ (i.e. $\lambda = 1$) then whatever the
values of $\pi_i$, $\theta_M = \tfrac{1}{2}$; when $\theta > \tfrac{1}{2}$ then $\theta_M > \theta$, and when $\theta < \tfrac{1}{2}$ we
have $\theta_M < \theta$. These relations will be found helpful when we pass on
to consider the average sample size.

For any sequential trial the average number of untied pairs $\bar{n}_\theta$
required to reach a decision depends only on the true value of $\theta$. Three
points of interest on the Average Sample Number curve (untied pairs)
can be obtained directly from the parameters of the plan; the formulae
are given as (7), (8) and (9). Using (3), (4) and (5) we may write these
formulae for $\bar{n}_\theta$ as decreasing functions of $\theta_1$ or $\theta_{M_1}$.

Therefore for the three points at least the average number of untied
pairs needed to reach a decision is smaller for a matched-pair plan than
for an experiment with unrestricted pairing since $\theta_{M_1} > \theta_1$. Other
points of the Average Sample Number curve, except the two limiting
cases of $\theta = 0$ and $\theta = 1$, cannot be obtained directly from the
parameters of the plan but have to be calculated indirectly.

Figure 2 shows the Average Sample Number curves (untied pairs) for
the experimental plans. The scale of $\theta_M$ has been so arranged as to
show the values of $\theta_M$ and of the average sample size corresponding
to given values of $\theta$ for the experiment using unrestricted pairing.
The Average Sample Number curve for the constant $\theta_M$ plan is pulled
slightly out of shape as the result of this arrangement. The two curves
show that for any value of $\theta$ in the population and its corresponding
value of $\theta_M$, the experiment using matched pairs is likely to be the more
economical. A similar picture can be obtained for the proportional
change method.

[Figure 2: Average Sample Number (untied pairs) curves; vertical axis:
number of untied pairs; one curve is labelled "unrestricted pairing".]

FIGURE 2. COMPARISON OF ASN CURVES FOR SEQUENTIAL PLANS SPECIFIED IN
TABLE 3.

proof is simultaneous switching around of the designations “success,” “failure” and “control,”
“experimental.” Such a change of ‘names’ does not affect $\theta$ so that the problem can then be
re-stated with the usual designations.

The average length of the whole experiment depends on the true
value of $\theta$ in the population and on the values of $\pi$ and $\pi'$ for the two
treatments. Since by matching we make the members of each pair
similar in response the untied pairs must become less frequent; it is
therefore not surprising that it can be shown that $K_{M_1} > K_1$ unless
ro Ae eee 

For matched-pair plans, part of the gain in the average number of
untied pairs is offset through $K_{M_1} > K_1$; we have therefore to consider
the total sample size.

Suppose $\pi_i = p_i$ and $\pi'_i = p'_i$, where $\pi_i$ and $\pi'_i$ are linked by a constant
$\theta_{M_1}$; in other words suppose that our estimates of proportions
of success in the strata and our guess of the efficiency of the new
treatment are correct. We want to know whether the matched-pair plan
is more or less economical than the usual method based on unrestricted
pairing. Let us enquire for what kind of population structure the average
total sample size is at maximum for a matched-pair plan.

From (7) and (12) by substituting (3)-(6) we may write $N_{\theta_{M_1}} = C/Q$, with

$$Q = \sum a_i p'_i (1 - p_i) \log_{10} \frac{2 \sum a_i p'_i (1 - p_i)}{\sum a_i [p'_i (1 - p_i) + p_i (1 - p'_i)]} + \sum a_i p_i (1 - p'_i) \log_{10} \frac{2 \sum a_i p_i (1 - p'_i)}{\sum a_i [p'_i (1 - p_i) + p_i (1 - p'_i)]}, \tag{16}$$

where

$$C = \log_{10} \frac{1 - \beta}{\alpha} - \beta \log_{10} \frac{(1 - \alpha)(1 - \beta)}{\alpha \beta}, \tag{17}$$

$\alpha$ and $\beta$ being the errors of both kinds adopted for the trial.

For the maximum of $N_{\theta_{M_1}} = K_{\theta_{M_1}} \bar{n}_{\theta_{M_1}}$ we want $Q$ to be a minimum.
Write

$$Z_1 = \sum_{i=1}^{R} a_i p'_i (1 - p_i), \qquad Z_2 = \sum_{i=1}^{R} a_i p_i (1 - p'_i), \tag{18}$$

so that $Z_1 - Z_2 = p' - p$.

Differentiating with respect to $Z_1$, we find $Q$ to be a decreasing function
of $Z_1$, so that $Q$ is a minimum when $Z_1$ is a maximum. We may write
$Z_1$ as

$$Z_1 = p' - pp' - \left( \sum_{i=1}^{R} a_i p_i p'_i - pp' \right). \tag{19}$$

For maximum $Z_1$ we want the expression in the bracket of (19) to be
a minimum. Since this expression is zero when all $p_i = p$ it remains
to be shown that it is positive for $p_i \ne p$.

It was shown by other means (14) that $\theta_{M_1} > \theta_1$; let us write
this in terms of $Z$'s,

$$\theta_{M_1} - \theta_1 = \frac{Z_1}{Z_1 + Z_2} - \frac{p'(1 - p)}{p'(1 - p) + p(1 - p')} = \frac{(p' - p) \left( \sum a_i p_i p'_i - pp' \right)}{(Z_1 + Z_2)[p'(1 - p) + p(1 - p')]}. \tag{20}$$

Since the denominator of this fraction is always positive the numerator
is also positive for $p_i \ne p$ since $\theta_{M_1} > \theta_1$. Consequently the bracket
of (19) is always positive for $p_i \ne p$. Therefore $Z_1$ is a maximum when
all $p_i = p$. Thus the average total sample size for a matched-pair
plan is at maximum when $p_i = p$ and it is then equal to the average
total sample size of a plan with unrestricted pairing.

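When the experimental treatment is just as good as the standard,
i.e. $p_i = \pi_i = \pi'_i$, write $x = 1 - 4\theta_1 (1 - \theta_1)$ and $y = 1 - 4\theta_{M_1} (1 - \theta_{M_1})$.
Then from (8), (11) and (12)

$$\left[ \log_e \frac{1 - \alpha}{\beta} - \alpha \log_e \frac{(1 - \alpha)(1 - \beta)}{\alpha \beta} \right] \left[ \frac{1}{N_{\theta_{M_0}}} - \frac{1}{N_{\theta_0}} \right] = p(1 - p) \log_e (1 - x) - \sum a_i p_i (1 - p_i) \log_e (1 - y). \tag{21}$$

By expanding the logarithmic series the left hand side can be shown to
be positive if

$$(2\theta_1 - 1)^2 \, p(1 - p) < (2\theta_{M_1} - 1)^2 \sum a_i p_i (1 - p_i). \tag{22}$$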


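To obtain the condition for $\theta_{M_1}$ satisfying (22) write:

$$\epsilon = \frac{\sum a_i (p_i - p)^2}{p(1 - p)}, \qquad \text{so that } \sum a_i p_i (1 - p_i) = p(1 - p)(1 - \epsilon);$$

then from (22)

$$\frac{2\theta_{M_1} - 1}{2\theta_1 - 1} > \mu, \qquad \mu = (1 - \epsilon)^{-1/2},$$

and, developing $\mu$ in powers of $\epsilon$,

$$\theta_{M_1} > \theta_1 + \tfrac{1}{4}(2\theta_1 - 1)\epsilon + \text{terms in higher powers of } \epsilon. \tag{23}$$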


Since $\epsilon$ is usually small and as can be seen from (14) the difference
between $\theta_{M_1}$ and $\theta_1$ increases with $\epsilon$ it follows that in most practical
situations (22) is satisfied when $\theta_{M_1} > \theta_1$. Thus when $\theta = \theta_0 = \theta_{M_0} = \tfrac{1}{2}$
in the majority of cases the matched-pair trial will be shorter than that
based on unrestricted pairing. Corresponding proofs for Method 2
are easier to obtain since the common proportionality factor $\lambda$ simplifies
the formulae to a considerable extent.


DISCUSSION 


P. Armitage (1954) in the concluding paragraphs of his paper 
commented on the possibility of using matched pairs in sequential 
trials. Two methods of arranging a sequential plan for matched pairs 
have been discussed here and shown experimentally to give consistent 
results. 

On many occasions matching, even if desirable from the medical
point of view, may not be practicable. When, however, the flow of 
cases is ample and fairly steady, matching becomes a practical proposi- 
tion. 

In the examples given above, the matched samples were on the 
average about 10% shorter than the series using random pairing from 
the whole population. The differences between strata were, however, 
purposely exaggerated and in practice the difference between total 
sample sizes is likely to be small. In spite of this, matching may be 
advantageous for reasons indicated in the Introduction. 

The formula for $\theta_{M_1}$ (6) is rather insensitive to moderate errors in
the estimation of $p_i$ and $a_i$ so that the method may be used whenever
reasonably good estimates of these quantities can be obtained.

If the response in the strata fails to follow the assumed constant
$\theta_M$ or proportional change pattern the length of the trial will be affected.
Whether it will be shortened or lengthened will depend on the direction
of departure from the assumed pattern and on the values of $p_i$ and $a_i$
in the strata affected. In these circumstances, however, the length of
the trial is of comparatively little importance since we gain valuable 
information on differential response to the experimental treatment.


A NOTE ON THE RANK ANALYSIS OF INCOMPLETE BLOCK
DESIGNS: APPLICATIONS BEYOND THE SCOPE
OF EXISTING TABLES


Otto Dykstra, Jr.*
General Foods Corporation, Hoboken, New Jersey, U.S.A.


1. Introduction 


The use of rank analysis in incomplete block designs using a method
of paired comparisons is covered in papers by Bradley, either alone
(1953, 1954a, b, 1955) or in co-authorship with Abelson (1954) and
Terry (1952). As in these papers, it is assumed that there is an
experiment consisting of $t$ treatments with $n$ repetitions. Furthermore,
Bradley (1954a) postulates that associated with each of the $t$ treatments,
denoted by $T_1, \cdots, T_t$, there exist parameters, $\pi_i$ for $T_i$, such that
$\pi_i \ge 0$ and $\sum_{i=1}^{t} \pi_i = 1$. The parameters are further defined with the
probability statement that, if $X_i$ generally denotes an observation on
a sample of $T_i$,

(1)    $P(X_i > X_j) = \pi_i / (\pi_i + \pi_j)$

in the comparison of $T_i$ with $T_j$, $i \ne j$.
Bradley and Terry give the formulas for the maximum likelihood
estimates of the $\pi_i$:

(2)    $\dfrac{a_i}{p_i} - \sum_{j \ne i} \dfrac{n}{p_i + p_j} = 0, \qquad i = 1, \cdots, t,$

and

(3)    $\sum_{i=1}^{t} p_i = 1,$

and define

(4)    $a_i = 2n(t - 1) - \sum_{j \ne i} \sum_{k=1}^{n} r_{ijk},$

where $r_{ijk} = 1$ if $X_i > X_j$ and $r_{ijk} = 2$ if $X_j > X_i$ in the $k$th repetition
of the pair $(i, j)$.

*I am indebted to Mavis B. Carroll and Clifton C. Sutton for a careful reading of the paper and
their many constructive comments and criticisms.

Tables for the test procedures for small treatment and sample
sizes are provided in the papers by Bradley and Terry (1952) and
Bradley (1954b). The tables cover $t = 3$, $n = 1(1)10$; $t = 4$, $n = 1(1)8$;
and $t = 5$, $n = 1(1)5$, giving the $p_i$ to two decimals.
We use Equation (2) in the form

(5)    $p_i = \dfrac{a_i / n}{\sum_{j \ne i} (p_i + p_j)^{-1}}.$

Using any method of obtaining initial $p_i$, the resulting first estimates
are substituted into the right side of (5), the next estimates being
resubstituted, and so on, until the equalities hold. This is the method
formerly and still recommended for iterating. The only problem is to
obtain the initial $p_i$. This iterative procedure must be used even with
the tables, to obtain more decimals. Some experimentation with the
procedure will demonstrate the slowness of the convergence to the
maximum likelihood estimates, the values converging from only one
side.
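The iteration is simple to program. The sketch below is one possible reading of the procedure rather than the author's own routine: it updates all estimates simultaneously and rescales them to total one after every pass (an added assumption), and the function name `iterate_estimates` is hypothetical. The data are the $a_i$ of the four-treatment example ($n = 144$) tabulated later in the note, started from equal values.

```python
def iterate_estimates(a, n, p0, tol=1e-10, max_iter=10000):
    """Resubstitute into p_i = (a_i/n) / sum_{j != i} 1/(p_i + p_j) until the values settle."""
    p = list(p0)
    for iteration in range(1, max_iter + 1):
        new = []
        for i, (ai, pi) in enumerate(zip(a, p)):
            denom = sum(1.0 / (pi + pj) for j, pj in enumerate(p) if j != i)
            new.append((ai / n) / denom)
        total = sum(new)
        new = [v / total for v in new]          # keep the estimates summing to one
        if max(abs(u - v) for u, v in zip(new, p)) < tol:
            return new, iteration
        p = new
    return p, max_iter

wins = [271, 208, 193, 192]                     # a_i for t = 4 treatments, n = 144
estimates, passes = iterate_estimates(wins, n=144, p0=[0.25] * 4)
print([round(v, 4) for v in estimates], passes)
```

Printing `passes` for different starting values gives a direct feel for how much a good first estimate shortens the work.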

During a seminar at V. P. I. Professor M. G. Kendall suggested
one method which might give reasonably good first approximations from
which solutions of the likelihood equation might be found by iteration.
Letting the number of times $X_i > X_j$ in $n$ repetitions of the pair $(i, j)$
be denoted by $a_{ij}$, then the first estimates are given by
(6) Di = Wt ni} Gi (2) 

A shortcoming of the formula is that the more the $a_{ij}$ deviate from
their expected values, the more inaccurate the first estimates given by
(6) will become.

The main purpose of this paper is to present a quick and easy
method of obtaining first estimates of the $\pi_i$, regardless of the size of
$t$ and $n$. In addition, because in certain cases the formula severely
overestimates some of the first estimates, a means of correcting for the
overestimation is also given.


2. Derivation of the Formula Estimating the $\pi_i$


We now develop an alternative formula, which does not have the
shortcoming of the Kendall formula and which in most experimental
situations will prove superior to (6) even without the correction to be
described later. It gives first estimates generally closer to the
maximum likelihood estimates, thus requiring fewer iterations.
We assume (and later examine the assumption) that the $p_j$ in (5)
are not too different from one another. If the $p_j$'s were all equal to
$p_j$, say, then using the fact that $\sum_{i=1}^{t} p_i = 1$,

(7)    $p_j = (1 - p_i)/(t - 1).$

Substituting (7) for the $p_j$ in (5) we finally obtain

(8)    $p_i = \dfrac{a_i}{n} \cdot \dfrac{(t - 2) p_i + 1}{(t - 1)^2},$

which, when solved for $p_i$, yields

(9)    $p_i = a_i / [n(t - 1)^2 - a_i (t - 2)].$*
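Formula (9) needs only the totals $a_i$, as the following sketch shows. The function name `first_estimates` is hypothetical; the data are those of the four-treatment example ($t = 4$, $n = 144$) tabulated later in the note.

```python
def first_estimates(a, n, t):
    """First estimates p_i = a_i / [n (t - 1)^2 - a_i (t - 2)]."""
    return [ai / (n * (t - 1) ** 2 - ai * (t - 2)) for ai in a]

wins = [271, 208, 193, 192]
p_first = first_estimates(wins, n=144, t=4)

print([round(v, 4) for v in p_first])
print(round(sum(p_first), 4))      # the unadjusted estimates need not total one
```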


3. Application of the Formulas for Estimating x; 


The experimental example in Table 1 is given to compare the accu- 
racy of the formulas (6) and (9). In this example ¢ = 8 and n = 40. 


TABLE 1 
Comparison of the Formulas (6) and (9) 


Pi Di — i 
i Ri a 
(6)* (9)* (6) (9) 

1 211 208 204 — .003 — .007 
2 164 165 . 163 001 SACU IN 
3 147 151 . 148 004 001 
4 133 135 134 002 001 
5 125 137 . 127 012 002 
6 082 084 083 .002 001 
7 .070 056 072 — .014 .002 
8 . 068 065 069 — .003 001 


*The values determined by each formula have been adjusted proportionately so that DG pt = 13 


In this example the $a_{ij}$ did not deviate appreciably from their
expected values, so that (6) could be applied. The column of differences
between the two sets of first estimates ($p_i$) and the maximum likelihood
estimates ($\hat{\pi}_i$) indicates the superiority of (9), which would be still
greater if more decimals were required.


*These results were also obtained by C. Y. Kramer and M. C. K. Tweedie at the Virginia Polytechnic
Institute by somewhat different approaches, independent of the present author and of each
other, in unpublished work.


4. Further Improvement of the Formula $p_i = a_i/[n(t - 1)^2 - a_i(t - 2)]$

The advantage in using the equation suggested here for estimating
the $\hat{\pi}_i$ is not so great when the approximation based on the assumption
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of equal p,’s is inaccurate (such as occurs when the spread in the various 
$\pi_i$ values is large). This is illustrated by the example in Table 2,
where $t = 4$ and $n = 144$.


TABLE 2

Comparison of $\hat{\pi}_i$ and $p_i = a_i/[n(t - 1)^2 - a_i(t - 2)]$

 $i$     $a_i$   $\hat{\pi}_i$   $p_i$*    $cp_i$**   $p_i - \hat{\pi}_i$   $cp_i - \hat{\pi}_i$

  1       271      .3593        .3594      .3529        .0001              -.0064
  2       208      .2294        .2364      .2321        .0070               .0027
  3       193      .2064        .2121      .2083        .0057               .0019
  4       192      .2049        .2105      .2067        .0056               .0018
 Total            1.0000       1.0184     1.0000        .0184               .0000

*Determined by (9).
**$c$ adjusts the $p_i$ proportionally so that $\sum_i cp_i = 1$.


It can be seen that each $p_i$ in the table overestimates the corresponding
$\hat{\pi}_i$, and the amount of overestimation is smallest for the largest $p_i$.
In general, the spread of the $p_j$'s ($j \ne i$) and thus the amount of
overestimation is less for the smallest and largest $p_i$'s than it is for the
intermediate $p_i$'s. Since $p_i > \hat{\pi}_i$ for all $i$, $\sum_i p_i > 1$, so that when the $p_i$
are reduced proportionately to total one, the largest $\pi_i$ is severely
underestimated.

An amended formula, determined empirically, has been used extensively
and invariably improved the approximation to the $\pi_i$. In
most instances indeed the amended formula gave sufficiently accurate
estimates, with only one iteration necessary to check the convergence
to the maximum likelihood estimates. We first define

(10)   $k_i = \dfrac{[(t - 1)R - S^2]/t + a_i a_{.i}}{(t - 1)R - S^2 + \sum_i a_i a_{.i}} \left(\sum_i p_i - 1\right)$

where

$R = \sum_i a_i^2$,

$S = \sum_i a_i = n\binom{t}{2}$,

$a_{.i} = n(t - 1) - a_i$,

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and 


$p_i = a_i/[n(t - 1)^2 - a_i(t - 2)]$, equation (9).


Then an improved first estimate ($\hat{p}_i$) for the maximum likelihood
estimate of $\pi_i$ is given by:


(11)   $\hat{p}_i = p_i - k_i$.


5. Application of the Correction for $p_i = a_i/[n(t - 1)^2 - a_i(t - 2)]$


The improvement in the estimates given by (9) when corrected using 
(10) is shown in Table 3. The results of Table 2 are used to illustrate 
the procedure. 


TABLE 3
Application of (9) and (10); $t = 4$, $n = 144$

 $i$   $a_i$   $a_{.i}$   $p_i$     $k_i$    $\hat{p}_i$   $\hat{\pi}_i$   $\hat{p}_i - \hat{\pi}_i$
  1     271     161      .3594     .0003     .3591        .3593         -.0002
  2     208     224      .2364     .0068     .2296        .2294          .0002
  3     193     239      .2121     .0057     .2064        .2064          .0000
  4     192     240      .2105     .0056     .2049        .2049          .0000
        864     864     1.0184     .0184    1.0000       1.0000          .0000

$R = \sum_i a_i^2 = 190,818$

$S = \sum_i a_i = n\binom{t}{2} = 864$

$[(t - 1)R - S^2]/t = -43,510$

$(t - 1)R - S^2 + \sum_i a_i a_{.i} = 8,388$


It is obvious that the $\hat{p}_i$ are very close to the maximum likelihood
estimates $\hat{\pi}_i$. In this example only one iteration is necessary to obtain
the $\hat{\pi}_i$ using $\hat{p}_i$.

In computing $k_i$ the only term which changes as $i$ changes is the
product $a_i a_{.i}$, the other terms being constant for a given experiment.
When the $p_i$ are estimated by (9), the correction need be applied only
when $\sum_i p_i$ is substantially larger than one.
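The correction can be sketched in Python as follows. This is only an illustration under stated assumptions: the expression for $k_i$ is the one reconstructed from the worked quantities in Table 3 (it reproduces the tabulated values 8,388 and about -43,510), and the function name `corrected_estimates` is hypothetical.

```python
def corrected_estimates(a, n):
    """Apply the empirical correction k_i to the first estimates of formula (9).

    Returns (p_hat, k), where p_hat[i] = p[i] - k[i].
    """
    t = len(a)
    # First estimates, formula (9).
    p = [a_i / (n * (t - 1) ** 2 - a_i * (t - 2)) for a_i in a]

    R = sum(a_i ** 2 for a_i in a)             # R = sum of a_i^2
    S = sum(a)                                 # S = sum of a_i
    a_dot = [n * (t - 1) - a_i for a_i in a]   # a_.i = n(t-1) - a_i
    cross = [a_i * ad for a_i, ad in zip(a, a_dot)]

    base = (t - 1) * R - S ** 2
    excess = sum(p) - 1                        # amount by which sum(p) exceeds one
    k = [(base / t + c) / (base + sum(cross)) * excess for c in cross]
    p_hat = [p_i - k_i for p_i, k_i in zip(p, k)]
    return p_hat, k


# Data of Tables 2 and 3 (t = 4, n = 144).
p_hat, k = corrected_estimates([271, 208, 193, 192], n=144)
print([round(x, 4) for x in k])
print(round(sum(p_hat), 6))
```

Because the numerators of the $k_i$ add up to the common denominator, the $k_i$ sum to $\sum_i p_i - 1$ and the corrected estimates total one without any proportional rescaling.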
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6. Conclusions 


In summary, we can list the following two advantages for the 
proposed method: 


a. It is quick and easy to apply. Even when the $p_i$ are widely
diverse the two formulas together will give first estimates very
close to the maximum likelihood estimates.

b. The first estimates are not sensitive to deviations from the model,
depending on the sum of the $a_{ij}$ and not the individual $a_{ij}$, as
in the Kendall formula. Being insensitive to deviations from the
model is not a disadvantage, since the maximum likelihood
estimates of the $\pi_i$ must first be obtained, in order to test the
appropriateness of the model [Bradley (1954a)].


REFERENCES 


Abelson, R. M. and Bradley, R. A. (1954). A 2 × 2 factorial with paired comparisons.
Biometrics, 10, 487.

Bradley, R. A. and Terry, M. E. (1952). Rank analysis of incomplete block designs.
I. The method of paired comparisons. Biometrika, 39, 324.

Bradley, R. A. (1953). Some statistical methods in taste testing and quality evalua- 
tion. Biometrics, 9, 22. 

Bradley, R. A. (1954a). Incomplete block rank analysis: on the appropriateness of 
the model for a method of paired comparisons. Biometrics, 10, 375. 

Bradley, R. A. (1954b). Rank analysis of incomplete block designs. II. Additional 
tables for the method of paired comparisons. Biometrika, 41, 502. 

Bradley, R. A. (1955). Rank analysis of incomplete block designs. III. Some large- 
sample results on estimation and power for a method of paired comparisons. 
Biometrika, 42, 450. 


EXTENSION OF MULTIPLE RANGE TESTS TO GROUP 
MEANS WITH UNEQUAL NUMBERS OF REPLICATIONS 


Clyde Young Kramer


Virginia Agricultural Experiment Station of the Virginia Polytechnic Institute 
Blacksburg, Virginia, U.S.A. 


In many fields of research, one is faced with the task of comparing 
the effects of treatments which have been replicated unequally. This 
happens for a number of reasons. In an experiment on animals, some 
may get sick and have to be removed from the experiment. In some 
experiments, the amount of material available for certain treatments 
may not be as much as for other treatments. If the experimenter has 
specified orthogonal contrasts that he is interested in before he runs 
the experiment, one can test the various treatment effects by an F-test 
after the treatment sum of squares has been partitioned into individual 
degrees of freedom for each orthogonal contrast. If the experimenter 
has not specified orthogonal contrasts, one is faced with the problem 
of deciding which treatments are significantly different. 

Several writers, including Duncan, Keuls, Newman, and Tukey,
have developed multiple range tests to show differences among treatments
that have been replicated the same number of times and when
nothing was specified concerning the treatments. Duncan [1] compares
the above methods and gives citations. This extension to unequal
numbers of replications will be exemplified with reference to Duncan's
"New Multiple Range Test," but is applicable to any of the above
writers' tests; all one has to do is use their tabled ranges.

In Duncan's test for an equal number of replications, the difference
between any two ranked means is significant if the difference
exceeds a shortest significant range. This shortest significant range is
designated by $R_p$ and is obtained by multiplying the standard error of
a mean, $s_{\bar{x}}$, by a given value, $z_{p,n_2}$, obtained from a table of significant
studentized ranges which Duncan has tabled for both the 5% and 1%
test. In Duncan's terminology, $n_2$ is the degrees of freedom of the error
mean square and $p = 1, 2, \ldots, t$ is the number of means concerned.

Consider an experiment with five treatments, A, B, C, D, and E,
each replicated n times. Suppose on ranking the means from low to 


high one obtains 
ioe Cr Sees, ta Soy 
In order for 3 — Zc to be significant, Ep — £c must exceed Rs = 8225.2, ; 
in order for #3 — Za to be significant, £, — £4 must exceed Ry = 8:24,n,, 
etc. Now if we had only two treatments, B and C, #, — fo would be 
significant if 3 — Ze exceeded Ry = S:22,n,- Since s,, = s/n 
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and s;, = s/n, then 


It is then seen that 


t 
&s = Rp rs . 


if Ba , Ep, °°: , & are based on n4 , Ne , -*- , Ne Feplications, 
2 ms — . . . . 
s;, = s/n;. Now for Z; — Ze to be significant, it is reasonable that 
£p — £c should exceed 


Pare. by 
(+3) caren 


and for ; — @, to be significant, ; — £4 should exceed 


‘ (4 + 1) eee, Boe 


NB Na 


Now 
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of replications will be a conservative test. Evaluation of specific 
significance and prediction levels would be extremely difficult and im- 
practicable. If the number of replications differs greatly, there will be 
an increased probability of a significant difference within a subset of 
ranked means classed as homogeneous by this test. 


NUMERICAL EXAMPLE

(a) Treatment Means Ranked in Order and Number of Replications:

                 F       D       A       B       C       E
  $\bar{x}$     458     498     521     528     564     630
  $n$             3       5       4       3       5       2

(b) Analysis of Variance:

  Source                 d.f.     M.S.        F
  Between Treatments       5      9306.17     3.88 (P < .025)
  Error                   16      2397.00

(c) $s = \sqrt{2397} = 48.96$

(d) Significant Studentized Ranges for a 5% Duncan Multiple Range
Test [1]:

  $p$            (2)      (3)      (4)      (5)      (6)
  $n_2 = 16$     3.00     3.15     3.23     3.30     3.34

(e) Appropriate Significant Range Factors ($R_p' = s z_{p,16}$):

  $p$       (2)        (3)        (4)        (5)        (6)
  $R_p'$    146.88     154.22     158.14     161.57     163.53
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must exceed Ri = 161.57. 


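$(\bar{x}_E - \bar{x}_D)\sqrt{\dfrac{2(2)(5)}{2 + 5}} = 132 \times 1.69 = 223.08 > R_5'$,

therefore, $\bar{x}_E - \bar{x}_D$ is significant.
Completing the test, we have

$(\bar{x}_E - \bar{x}_A)\sqrt{\dfrac{2(2)(4)}{2 + 4}} = 109 \times 1.63 = 177.67 > R_4'$,
therefore, significant,

$(\bar{x}_E - \bar{x}_B)\sqrt{\dfrac{2(2)(3)}{2 + 3}} = 102 \times 1.54 = 157.08 > R_3'$,
therefore, significant,

$(\bar{x}_E - \bar{x}_C)\sqrt{\dfrac{2(2)(5)}{2 + 5}} = 66 \times 1.69 = 111.54 < R_2'$,
therefore, not significant,

$(\bar{x}_C - \bar{x}_F)\sqrt{\dfrac{2(5)(3)}{5 + 3}} = 106 \times 1.94 = 205.64 > R_5'$,
therefore, significant,

$(\bar{x}_C - \bar{x}_D)\sqrt{\dfrac{2(5)(5)}{5 + 5}} = 66 \times 2.24 = 147.84 < R_4'$,
therefore, not significant,

$(\bar{x}_B - \bar{x}_F)\sqrt{\dfrac{2(3)(3)}{3 + 3}} = 70 \times 1.73 = 121.10 < R_4'$,
therefore, not significant.

Now we may write

    F      D      A      B      C      E
    ________________________
           ________________________
                                ___________

and say that any two means not underscored by the same line are
significantly different, and any two means underscored by the same line
are not significantly different, in the sense defined.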
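The comparisons in the numerical example can be re-run with a short Python sketch. It assumes the adjusted difference takes the form $(\bar{x}_i - \bar{x}_j)\sqrt{2 n_i n_j/(n_i + n_j)}$ and is compared with $s z_{p,16}$, as the worked figures above indicate; the helper name `is_significant` is hypothetical. The sketch uses unrounded square roots, so its intermediate products differ slightly from the two-decimal arithmetic of the example, although the conclusions agree.

```python
import math

# Ranked treatments from the numerical example: (name, mean, replications).
ranked = [("F", 458, 3), ("D", 498, 5), ("A", 521, 4),
          ("B", 528, 3), ("C", 564, 5), ("E", 630, 2)]
s = math.sqrt(2397.00)                                # root of the error mean square
z = {2: 3.00, 3: 3.15, 4: 3.23, 5: 3.30, 6: 3.34}     # 5% values for 16 d.f.


def is_significant(low, high):
    """Test the difference between two treatments named in `ranked`."""
    names = [name for name, _, _ in ranked]
    i, j = names.index(low), names.index(high)
    p = j - i + 1                                     # number of means spanned
    _, mean_i, n_i = ranked[i]
    _, mean_j, n_j = ranked[j]
    adjusted = (mean_j - mean_i) * math.sqrt(2 * n_i * n_j / (n_i + n_j))
    return adjusted > s * z[p]


for low, high in [("D", "E"), ("A", "E"), ("B", "E"), ("C", "E"),
                  ("F", "C"), ("D", "C"), ("F", "B")]:
    verdict = "significant" if is_significant(low, high) else "not significant"
    print(f"{high} - {low}: {verdict}")
```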
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SIMPLIFIED LD,, (OR ED,;,) CALCULATIONS 


Henry J. Horn 
Hazleton Laboratories, Falls Church, Virginia, U.S.A. 


In industry a low-cost screening program to evaluate adequately the 
hazard and/or potency of numerous new chemicals, drugs, compounds, 
and combinations thereof is a fundamental requirement. An idea of 
the range of toxicity or potency can be satisfactorily determined from 
a limited number of animals. Four to five animals per group will be 
satisfactory as an accurate evaluation is usually not required in the 
early stages of development. Furthermore, a costly screening program 
is economically unfeasible as from a multitude of new items only one or 
two may reach widespread commercial usefulness. 

In light of the above, this paper presents a practical method for the
routine screening to estimate the acute LD50 (or ED50) by whatever
route of administration is desired. By a unique choice of doses a table
of LD50 (or ED50) values with their confidence limits can be prepared;
hence, the mathematical computation is avoided and all that is required
is to read the values from the tables. Although only LD50 is spoken of
throughout the remainder of the paper for convenience of presentation,
the calculation of the ED50 will be identical.

The statistical method used is the moving-average interpolation
method as originally presented by Thompson [1]. In this method, the
logarithm of the LD50 is determined by simple linear interpolation
between moving averages of the doses surrounding 50% mortality;
the antilogarithm will then yield the estimated LD50. Weil [2] has
computed tables to assist in the computation of the LD50 when four
successive doses ($D_1$, $D_2$, $D_3$, $D_4$) in geometric progression are used and
the same number of animals ($n$) is used at each dosage level:
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(1) Log LDs5o log D, + d(f + 1) 
where: $D_1$ = lowest dose of the 4 doses used in the calculation

$d$ = logarithmic increment between logarithms of the
doses or logarithm of the multiplier relating each
successive dose to its immediately smaller dose.

$f$ = tabled value given by Weil [2] for the combination
of $r_1$, $r_2$, $r_3$, and $r_4$ (actual number of animals
responding, for example, dead, at doses $D_1$, $D_2$,
$D_3$ and $D_4$, respectively) determined experimentally.



Further simplification of Equation (1) can be accomplished as the
logarithmic increment is constant between doses; therefore, the one $d$
may be dropped when $D_2$ is substituted for $D_1$ (since $\log D_2 = \log D_1 + d$).
Furthermore, the equation may be written in terms of the LD50 rather than
the logarithm of the LD50. This can be accomplished by rewriting, as follows:


(2) LD50 = antilog [log $D_2$ + $d \times f$]
The value of $f$ will vary from 0 to 1;
when $f = 0$ then the LD50 = $D_2$ and
when $f = 1$ then LD50 = $D_3$.


The 95 per cent confidence limits may be calculated by the following
formula:


(3) Confidence limit = antilog (log LD50 ± 2 × $d$ × $\sigma_f$)


Tables of values of $\sigma_f$ are given by Weil [2]. If the value of $\sigma_f$ is zero,
which occurs only when $r_1$, $r_2$, $r_3$, and $r_4$ are zero mortality, zero
mortality, total mortality, and total mortality, respectively, the confidence
limits cannot be calculated mathematically but will lie within
the boundaries specified by those successive doses the lower of which
produced no mortality while the higher produced total mortality.
The main purpose of this paper is to present a further simplification
of the moving average interpolation method. This is accomplished by
using a dosage increment factor which is an integer root of 10; thus the
logarithmic increment between the logarithm of doses will be a simple
fraction. Examples are $\sqrt{10}$, $\sqrt[3]{10}$, etc., and the logarithms are
1/2, 1/3, etc., respectively. By this choice, it will be found that the
series of doses will repeat themselves at successively higher multiples
of 10; for example, the dosage series for a multiplier of $\sqrt{10}$ may be
expressed as $\{1.00, 3.16\} \times 10^t$ where $t = 0, \pm 1, \pm 2, \pm 3$, etc. This means
that the mantissae of the logarithms will be found to repeat themselves
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at intervals of every other one, two, ete., depending on whether W710: 
$\sqrt[3]{10}$, etc., is used, respectively.

The following series of practical doses are proposed for the estimation
of the LD50:


Multiplier = $\sqrt{10}$                        Multiplier = $\sqrt[3]{10}$
Log increment = 1/2                       Log increment = 1/3
Dosage series $\{1.00, 3.16\} \times 10^t$      Dosage series $\{1.00, 2.15, 4.64\} \times 10^t$

Dosage level,                             Dosage level,
mg/kg                                     mg/kg
10.0                                      10.0
31.6                                      21.5
100                                       46.4
316                                       100
1000                                      215
3160                                      464
                                          1000
                                          2150
                                          4640

It will be noted that doses range from a low of 10 mg/kg to a high of 3
to 5 g/kg; this is adequate for screening purposes. If a compound has
an LD50 below 10 mg/kg it will be known to be extremely toxic and,
conversely, if the LD50 is greater than 3 to 5 g/kg, it can be assumed
to be relatively nontoxic.

As there is a cyclic repetition of dosage increment levels, there will
also be a cyclic progression of corresponding LD50 values. Hence, if a
table of the LD50 is prepared for one cycle, say between 1 and 10, then
the values for any other cycle will be $10^t$ times the tabulated values
(where $t = 0, \pm 1, \pm 2, \pm 3$, etc.). The LD50 values must be tabulated
according to the doses ($D_2$ and $D_3$) that straddle the LD50; that is,
with a dosage series $\{1.00, 3.16\} \times 10^t$, two columns of LD50 values are
presented, the first column covering those which lie between $1.00 \times 10^t$
and $3.16 \times 10^t$, while the second column covers those which lie between
$3.16 \times 10^t$ and $10.0 \times 10^t$. As the multiplier becomes a higher root
of 10, the number of doses per cycle increases; hence, an increasing
number of columns of LD50 values must be listed (equal in number to
the power of the root).
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Four tables of LD. values and their 95 per cent confidence limits 
have been prepared: Table 1 for $n = 4$ (4 animals at each dosage level)
and a multiplier of $\sqrt{10}$ between successive doses; Table 2 for $n = 4$
and a multiplier of $\sqrt[3]{10}$; Table 3 for $n = 5$ and a multiplier of $\sqrt{10}$;
and Table 4 for $n = 5$ and a multiplier of $\sqrt[3]{10}$. These tables have
been computed using the values of $f$ and $\sigma_f$ as given by Weil [2].

In practical application, the series of dosages specified in the previous
lists will be administered. Selection of the number of animals
per dose (i.e., 4 or 5) and the particular series of doses (i.e., using a
multiplier of $\sqrt{10}$ or $\sqrt[3]{10}$) will be determined usually by economics.
If the range of toxicity is to be determined in larger animals such as
rabbits, cats, etc., four animals per dosage level and the series corresponding
to the $\sqrt{10}$ will be more economical. However, if smaller animals
such as mice or rats are used, the use of five animals per dosage level
and a series of dosages corresponding to $\sqrt[3]{10}$ will usually be found
feasible.

An experienced technician can also effect further improvement in
economy by selectively administering only those dosage levels which
are necessary for the determination of the LD50. For example, if the
LD50 is below 10 mg/kg the administration of only the low dosage
levels will be adequate; conversely, if the LD50 is above 5 g/kg, the
administration of the higher dosage levels only will be adequate.
Furthermore, it is often possible to make a rough guess as to the LD50 and
administer only those dosage levels in that vicinity. If there is no
knowledge of the LD50, then experience has shown that administration
of the middle doses will often clarify the situation. For example, if all
the animals at say 215 mg/kg die within 5 to 10 minutes, then only
the lower dosage levels need be given.

Occasionally the mortality results of the series of dosages will yield
no mortality for the lower doses followed by an immediate transition to
total mortality for the successive higher doses. This means that for
the particular compound under study, the regression line is relatively
steep. Under these circumstances, it will be impossible to determine
confidence limits although it will be quite obvious that these values
lie between those doses at the transition point. If, or when, a more
accurate LD50 need be determined, the range of doses for the new study
will be apparent.

In addition to the convenience of determining the LD50, other
advantages arise from the use of this scheme of doses. The preparation 
of solutions containing the test material will be simplified since 1 to 10 
and 1 to 100 dilutions will yield the correct dosages at the lower levels 
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when the same quantity per kg. is administered; that is, the factor for 
the quantity of material to be administered per unit of body weight 
will remain the same in each dosage cycle. It will also tend to maintain 
the quantity administered to all the animals within closer limits. 

As only four or five animals are used per dosage level, the selection 
of animals for each group in the study is most important. Because 
individual variations may be marked, poor selection will produce an 
irregular progression in the mortality results. In order to keep this 
irregular progression to a minimum, a stratified random sampling of 
animals is recommended. This is obtained, for example, by placing in 
each dosage group equal numbers of individuals having each of the 
following characteristics: heaviest weight, lightest weight, average 
weight, most agile, most lazy, etc. Randomization is used to allocate 
the animals from each category to all the dosage groups. Animals of 
questionable health must be excluded. 

The following example will serve to illustrate the use of the simplified
technique presented here: It was desired to determine the rat oral LD50
of an unknown compound on which nothing was known relative to
toxicity. Five rats per group were used with the dosage progression
$\{1.00, 2.15, 4.64\} \times 10^t$ as described above. The middle dosage of 215 mg/kg
was administered first and after one-half hour, signs of ataxia were
observed. It was therefore felt that administration of the next higher
dose and all the lower doses would result in bracketing the LD50.
Results at the end of seven days were:


Dose        Mortality
mg/kg       per group
10.0        0/5
21.5        0/5
46.4        1/5
100         2/5
215         4/5
464         5/5


Inspection indicates that either the series of doses with corresponding
mortality of 0, 1, 2, 4, or that with 1, 2, 4, 5 should be analyzed. A
search through Table 4 shows that the series 1, 2, 4, 5 is listed and that
the estimated LD50 is 110 mg/kg with confidence limits of 55.0 to 220
mg/kg.
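The point estimate in this example can be checked with a small Python sketch of the moving-average interpolation. It rests on an assumption not spelled out in the paper: that the tabled $f$ equals the linear-interpolation fraction between the two three-dose moving averages, i.e. $f = (1.5n - r_1 - r_2 - r_3)/(r_4 - r_1)$. Weil's tables are not reproduced here, the function name is hypothetical, and the confidence limits are not computed because they require the tabled $\sigma_f$.

```python
import math


def ld50_moving_average(doses, deaths, n):
    """Estimate the LD50 from four geometric doses by moving-average interpolation.

    doses  : four doses D1..D4 in geometric progression
    deaths : numbers r1..r4 responding at each dose
    n      : animals per dose
    Returns (f, ld50). Assumes three-dose moving averages (see lead-in).
    """
    r1, r2, r3, r4 = deaths
    if r4 <= r1:
        raise ValueError("mortality must rise from the lowest to the highest dose")
    f = (1.5 * n - (r1 + r2 + r3)) / (r4 - r1)
    if not 0.0 <= f <= 1.0:
        raise ValueError("these four doses do not bracket 50% mortality")
    d = math.log10(doses[1] / doses[0])             # logarithmic increment
    log_ld50 = math.log10(doses[0]) + d * (f + 1)   # Equation (1)
    return f, 10 ** log_ld50


# Series 1, 2, 4, 5 from the example (doses in mg/kg, five rats per group).
f, ld50 = ld50_moving_average([46.4, 100, 215, 464], [1, 2, 4, 5], n=5)
print(f, round(ld50))
```

Under this assumption the sketch gives an LD50 that rounds to the 110 mg/kg read from Table 4.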
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SUMMARY 


The moving average interpolation method has been summarized.
A far more convenient and rapid use of this method may be achieved
by choosing a dosage increment factor which is an integer root of 10;
e.g., $\sqrt{10}$, $\sqrt[3]{10}$. This produces a repetition of the dosage levels at
successive multiples of 10 and simultaneously allows for the preparation
of tables giving estimated LD50 or ED50 and their confidence
limits directly from the observations. Four tables have been prepared
using parameters which experience has shown to be most useful. It
must be emphasized that this method is not applicable for an accurate
determination of the LD50 or ED50, but, rather, for estimating the
range of toxicity and/or potency as in screening programs. For this,
the economic advantage far overshadows any disadvantage due to
inaccuracy.
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TABL 


Values of estimated LD,, and confidence limits for geometric dosage 
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TABLE 1 (continued) 


Di = 0.316 Di = 1.00 
Ti, Te teens Dz = 1.00 D2= 3.16 i 
or Dimes doles Die wf 
Pees GT ai re Ds = 10.0 Ds = 31.6 
Confidence Confidence 
LDso limits | LDso limits 

2 ik 2 3 3.16 0 .0472—212. 10.0 0.149 —669. 

2 1 3 3 1.00 0.0149— 66.9 3.16 0.0472—212. 

2 YD 2 3 1.00 0.0100—100. 3.16 0.0316—316. 
0 0 4 2 3.16 0.837 — 11.9 10.0 2.65 — 37.8 
0 1 3 2, 3.16 0.386 — 25.9 10.0 1.22 — 81.8 
0 a 4 2 1.78 0.471 — 6.72 5.62 1.49 — 21.2 

0 2 2 2 3.16 0.316 — 31.6 10.0 1.00 —100. 
oO 2 3 2 1.78 0.271 — 11.7 5-62 0.858 — 36.9 
Pane 2 4 2 1.00 0.265 — 38.78 3.16 0.837 — 11.9 
0 = 3 2 1.00 0.196 — 5.09 3.16 0.621 — 16.1 

1 0 4 2 3.16 0.221 — 45.2 10.0 0.700 —143. 

1 1 3 2 3.16 0 .0472—212. 10.0 0.149 —669. 
1 it 4 2, 1.00 0.0385— 26.0 3.16 0.122 — 82.1 

if 2 2 2 3.16 0.0316—316. 10.0 0.100—1000. 

i 2 3 2. 1.00 0.0149— 66.9 3.16 0.0472—212. 

0 2 3 it 3.16 0.0472—212. 10.0 0.149 —669. 
0 2 4 1 1.00 0.0700— 14.3 - 3.16 0.221 — 45.2 
0 3 3 1 1.00 0.0385— 26.0 3.16 0.122 — 82.1 

0 1 4 al 3.16 0.122 — 82.1 10.0 0.385 —260. 

S 
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i ee he. 6.460) hoe 
Bas eee va Dz = 1.00 


: ~~ Ds eats 
a naman ras 4=4. 


- X 10¢ 


319 


LDs (OR EDs) CALCULATIONS 


(continued) 


? 


TABLE : 


Confidence Confidence 


Confidence 


limits 


LDso 


limits 


16—— 


LDs5o 
3 


limits 


LDso 


8.00 7.74 3.48 — 17.2 


9 


.5 


1.67 0.750 — 3.71 


i} 


5.99 2.69 — 13.3 
4.64 2.25 — 


9.57 


4.44 
6.86 
5.48 


1-91 — 11,3 


1.04 — 


-15 
2.78 


4 


5.99 2.43 — 14.8 


4.64 
10.0 


1-15 —= 


0.524 — 3.18 


.29 


1 


N 


WPS 2 tare 
4.12 — 24.3 


4 


N 


4.64 


0.888 — 5.23 


2.15 
1.47 


N 


6.81 2.81 — 16.5 


tf AE 


3.16 1.30 — 


0.605 — 3.56 


4.64 1.91 — 11.3 


10.0 


3 


N 


5. 


0.888— 


1.00 0.412 — 2.43 


3.38 — 29.6 


IOV e——— LS 27, 


4.64 


0.728 — 6.38 
1.47 0.419 — 5.14 


N 


1.95 — 23.9 


6.81 


3.16 0.903— 11.1 


2.15 


4 
4 
4 


8.75 4.64 1.14 — 18.8 


0.531— 


1.00 0.246 — 4.06 
1.00 0.215 — 4.64 


2.15 0.464— 10.0 4.64 1.00 — 21.5 
4.64 0.789— 27.3 10.0 


2.15 


Le /0) S589) 


2.15 0.366 —12.7 
1.90 0.114 — 8.77 
2.15 0.246 —18.9 
1.00 0.0607—16.5 
poe es a GeO 


gs oy i Ee te 


4.64 0.529— 40.7 


10.0 


0.246— 18.9 


4 


a8 Tals 


4.64 0.529— 40.7 
2.15 0.131— 35.5 


4.64 2.25 — 


4.64 0.282— 76.5 


10.0 


4 
3 
3 
3 


4.85 — 20.6 


9.57 
5.05 


— 4.44 
— 2.35 


7.14 «56.50 — 10.9 


10.0 


3.59 2.56 — 


3.93 — 25.5 


4.64. 1.82 — 11.8 


3.59 


2.15 0.846 — 5.48 


7.44 3.48 — 17.2 


8.00 
4.77 
8.86 
6.19 
3.89 
4.44 


1 Ol 


1.67 0.750 — 3.71 
1.29 0.753 — 2.21 
1.67 0.676 — 4,11 


3 


5.99 3.50 — 10.3 


2.78 1.62 — 


7.44 3,14 — 19.1 


3.59 1.46 — 


9.99 ~2.69 — 13.3 
4.64 2.57 — 


2.18 1625 — 


1.29 0.580 — 2.87 
1.00 0.554 — 1.81 


8.38 
9.57 
3.38 — 29.6 


4.64 2.25 — 


10.0 


Zao 119 
L.04 — 


2.15 


1.00 0.485 — 2.06 
2.15 0.728 — 6.38 


4.64 1.57 —13.7 


G28 a8 0 —— 110 


5.44 


ede hack 
4.64 1.14 — 18.8 


1.47 0.853 — 2.53 


2.46 — 40.6 


6.81 2.02 — 22.9 


10.0 


2.15 0.531 — 8.75 


3.16 0.940— 10.6 
2.15 0.728— 6.38 
3.16 0.807— 12.4 


2.15 0.531— 


1.47 0.436 — 4.94 
1.00 0.338 — 2.96 
1.47 0.375 — 5.75 


3 
3 


4.64 1.57 — 13.7 


6.81 1.74 — 26.7 


4.64 1.14 —-18.8 


8.75 


1.00 0.246 — 4.06 
2.15 0.246 —18.9 


3 
3 
3 


1.14 — 87.7 


4.64 0.789— 27.3 


10.0 


4.64 0.529— 40.7 
2.15 0.366— 12.7 
4.64 0.282— 76.5 
2.15 0.131— 35.5 
2.15 0.100— 46.4 


4.64 1.91 — 11.3 


1.00 0.170 — 5.89 
2.15 0.131, —35.5 


0.607—165. 


4.64 0.282— 76.5 
4.64 0.215—100. 


10.0 


4.12 — 24.3 
2.46 — 40.6 


6.81 2.81 — 16.5 


10.0 


-888 — 5.23 


-0464—21.5 
2.15 0.531 — 8.75 


.0607—16 .5 


0 
0 
0 


10.0 


4.64 1.14 — 18.8 


2 


7.67 


BLO toOe—— 
4.64 1.00 — 21.5 


1.47 0.605 — 3.56 
2.15 0.464 —10.0 


2.15 — 46.4 


6.8f° 1.95 — 923.9 


10.0 


3.16 0.903— 11.1 
2.15 0.888— 5.23 
2.15 0.728— 6.38 


1 OA to oe Le 


2 
2 
2 


4.64 1.915-— lis 


1.00 0.412 — 2.43 
1.00 0.338 — 2.96 


4.64 1.57 — 13.7 




BIOMETRICS, SEPTEMBER 1956 


320 


TABLE 3 
Values of estimated LD; and confidence limits for geometric dosage 


, +3, etc. (multiplier 


2 


1.00 
3.16 


+/10). 


series 


\ < 10° where ¢ = 0, 41, + 


or 
PA 5 T8hy G2 Gite 


Five animals per dosage level. 


11,725 73, 74 


Confidence 


Confidence 


limits 
5.07 —15.7 


LDso 
4.47 —11.2 


limits 
1.60 — 4.95 


ee 3) 


LDso 
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7.08 
5.62 


2.82 
2.24 
1.78 
2.82 
2.24 
1.78 
1.41 
2.24 


-55 


1) 


1.36 — 5.84 
1.08 — 4.64 


0.927— 3.41 


3.42 —14.7 
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0.568— 2.35 


0.555— 4.27 


17 — 9210 
3.01 —22.6 
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3.27 — 9.68 
2.08 —32.7 
1.67 —1879 
1.40 —10.5 
1.88 —36.3 


0.958— 7.15 
1.03 — 3.06 
0.658—10.4 
0.528— 5.98 
0.442— 3.32. 
0.594—11.5 


1.21 
2.61 
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TABLE 3 (continued) 
Di = 0.316 Dy = 1:00 
TE 524 FS Te Dz: = 1.00 D: = 3.16 
or De= Saelen”: Deetio.0N 
TL, Ts, Ya, Ta Ds = 10.0 Di = 31.6 
Confidence Confidence 
LD limits LDso limits 
1 2 3 4 1.78 0.423— 7.48 5.62 1.34 —23.6 
1 2 4 4 eA 0.305— 4.80 3.83 0.966—15.2 
af. 3 3 4 5 Uae | 0.276— 5.33 3.83 0.871—16.8 
2 0 + + 2.37 0.539—10.4 7.00 1.70 —33.0 
2 0 5 4 tes 0.446— 3.99 4.22 1.41 —12.6 
2 1 3 4 2.08 0.307—18.3 7.50 0.970—58 .0 
2 1 4 4 1.33 0.187— 9.49 4.22  0.592—30.0 
2 2 2 4 2.37 0.262—21.4 win00) 0.830—67.8 
2 2 3 4 1.33 0.137—13.0 4.22  0.433—41.0 
0 0 5 3 B.@1) Sate 871 8.25 3.77 —18.1 
0 1 4 3 2.61 0.684— 9.95 8.25 2.16 —31.5 
0 1 5 3 1.78  0.723— 4.37 5.62 2.29 —13.8 
0 2 3 3 2.61 0.558—12.2 8.25 1.76 —38.6 
0 2 4 3 1.78  0.484— 6.53 5.62 1.53 —20.7 
0 2 5 3 1.21 —0.467— 3.14 3.83 1.48 — 9.94 
0 3 3 3 1.78  0.434— 7.28 5.62  1.37'—23.0 
0 3 4 3 1.21 0.356— 4.12 3.83 4.131310 
1 0 5 3 2.37 0.793— 7.10 7.50 2.51 —22.4 
1 1 4 3 2.37  0.333—16.9 7°50 1.05 —53.4 
1 1 5 3 1.33 0.303 5.87 4.22  0.958—18.6 
1 2 3 3 2.37 0.244—23.1 7.50 0.774—73.0 
1 2 4 3 1.33 0.172—10.3 4.22 0.545—32.6 ° 
1 3 3 3 1.33 0.148—12.1 4.22 0.467—38.1 
TABLE 4 
Values of estimated LD;, and confidence limits for geometric dosage 
1.00 
. t . . —- 
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4.22 2.63 — 6.76 
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.67 — 13.5 
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.60 — 9.50 
.44 — 7.30 
43 — 9.95 
.08 — 8.14 
.66 — 14.6 
.98 — 11.3 
.87 — 7.87 
.94 — 16.7 
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TABLE 4 (continued) 
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A SIMPLE METHOD FOR FITTING AN ASYMPTOTIC 
REGRESSION CURVE 


H. D. Patterson
Rothamsted Experimental Station, Harpenden, Herts., England


Introduction 
The asymptotic regression equation

$y = \alpha - \beta\rho^x$ where $0 < \rho < 1$


is frequently used in agricultural research to express the relationship
between the yields of a crop, $y$, and the amounts of a given fertilizer,
$x$, applied to that crop. This equation is slightly different to that used
by Stevens (1951) in that a negative sign is given to the term $\beta\rho^x$.
This change allows $y$ to increase with positive values of $\beta$.

Stevens (1951) has described an efficient method for determining
the parameters $\alpha$, $\beta$ and $\rho$. He provided tables for 5, 6 and 7 equally
spaced ordinates which considerably reduce the arithmetic labour of
his method and which have the valuable feature that they enable the
variances and covariances of the estimates of $\alpha$, $\beta$ and $\rho$ to be directly
determined. These tables have been extended by S. Lipton on the
electronic computer at Rothamsted to cover the range of $n = 3$ to
$n = 12$. Since Stevens' method is one of successive approximation,
it is admirably suited for use on the electronic computer.

Pimentel Gomes (1953) has shown that, with equally spaced ordinates,
efficient estimates of $\rho$ can be obtained by solving equations of
the type


$J_0 y_0 + J_1 y_1 + \cdots + J_{n-1} y_{n-1} = 0$    (1)


where the $J$ are complicated polynomials in $r$, the estimates of $\rho$. He
also provided tables of the numerical values of the polynomials for
the cases $n = 4$ and $n = 5$. In practice $r$ can be obtained
using these tables by a process of trial and error.

The present paper deals with simple methods of estimation, for the
cases $n = 4$, 5, 6 and 7, which require the use neither of complicated
machinery nor extensive tables. The methods suggested are not fully
efficient (except in special circumstances) but the loss of precision is,
in most cases arising in practice, small.
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Only the estimation of p will be considered in detail as the estimates 
of $\alpha$ and $\beta$ are given by the linear regression of $y$ on $r^x$, where $r$ is the
estimate of $\rho$.

For convenience the ordinates will be taken to be $0, 1, 2, \ldots, n - 1$.
The estimates $r$ will, therefore, normally require to be converted to $r^{1/s}$,
where $s$ is a single increment of fertilizer in standard units.

The method of deriving the estimates considered in the present paper
depends on the result that the efficient estimate of $\rho$ can always be
expressed as the ratio of two contrasts between the $y$ with coefficients
given by functions of $\rho$. It can be expected that the ratio obtained by
using coefficients appropriate to some value of $\rho$, equal to $\rho_0$ say, will
be most efficient for a range of values of $\rho$ around $\rho_0$. In fact $\rho_0$ can
be selected so that the whole useful range of $\rho$ is covered with reasonable
efficiency by a single set of coefficients for each of the four cases
considered.


Estimation of ρ with four equally spaced levels of fertilizer


With three levels of fertilizer full information on ρ is provided by
the estimate

$$r = \frac{y_2 - y_1}{y_1 - y_0}.$$

This equation can conveniently be rewritten

$$y_2 - (1 + r) y_1 + r y_0 = 0. \tag{2}$$

When a fourth level of fertilizer is included r can still be determined
from Equation (2), or from

$$y_3 - (1 + r) y_2 + r y_1 = 0, \tag{3}$$

or from a combination of Equations (2) and (3). Thus the sum of
Equations (2) and (3) leads to the estimate

$$r = \frac{y_3 - y_1}{y_2 - y_0}, \tag{4}$$

which has frequently been recommended.

It is clear, however, that the relative amounts of information on ρ
provided by Equations (2) and (3) depend on the true value of ρ. Thus,
if ρ is small the range of values from y_0 to y_2 will be greater than that
from y_1 to y_3 and Equation (2) will provide more information than
Equation (3). If, on the other hand, ρ is nearly 1 so that the responses
to each increment of fertilizer are almost equal, Equations (2) and (3)
will provide approximately equal amounts of information.

The equations can, therefore, be combined with the relative weights
μ : 1, where μ is to be determined:

$$y_3 + (\mu - r - 1) y_2 - (\mu + \mu r - r) y_1 + \mu r y_0 = 0. \tag{5}$$

The estimate of r given by this equation is

$$r = \frac{y_3 + (\mu - 1) y_2 - \mu y_1}{y_2 + (\mu - 1) y_1 - \mu y_0}. \tag{6}$$

In order to utilise all the information provided by Equations (2)
and (3) μ should be chosen so that the variance of r is minimised. The
required expression for μ is

$$\mu = \frac{\rho^3 + 4\rho^2 + 3\rho + 2}{2\rho^3 + 3\rho^2 + 4\rho + 1}. \tag{7}$$

It is interesting to note that with these values of μ the estimates
(6) are fully efficient, i.e. the same as those obtained by the methods of
Stevens (1951) and Pimentel Gomes (1953). A proof of this statement
will be given when higher values of n are considered.

The values of μ range from μ = 2 when ρ = 0 to μ = 1 when ρ = 1.
When μ = 1 equation (6) reverts to equation (4). The estimate
(y_3 − y_1)/(y_2 − y_0) is therefore likely to be suitable only for high
values of ρ, with efficiency decreasing with ρ. Similarly
(y_3 + y_2 − 2y_1)/(y_2 + y_1 − 2y_0) will be most efficient when ρ is small.
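
The behaviour of the weight μ and of the estimate (6) can be checked numerically with a short sketch. The closed form used for Equation (7) is an assumption: it is the expression obtained by minimising the large-sample variance of (6) under independent errors of equal variance, and it agrees with the end points μ = 2 at ρ = 0 and μ = 1 at ρ = 1 quoted above. The function names and the illustrative values of α, β and ρ are hypothetical.

```python
def mu_opt(rho):
    """Weight mu of Equation (7) for n = 4 (assumed closed form)."""
    num = rho**3 + 4 * rho**2 + 3 * rho + 2
    den = 2 * rho**3 + 3 * rho**2 + 4 * rho + 1
    return num / den


def estimate_r4(y, mu):
    """Estimate (6): ratio of two contrasts among y_0..y_3 with weight mu."""
    y0, y1, y2, y3 = y
    return (y3 + (mu - 1) * y2 - mu * y1) / (y2 + (mu - 1) * y1 - mu * y0)


# Error-free responses from y = alpha - beta * rho**x (illustrative values only).
alpha, beta, rho = 70.0, 30.0, 0.5
y = [alpha - beta * rho**x for x in range(4)]

print(mu_opt(0.0), mu_opt(1.0))   # end points of the range of mu
print(estimate_r4(y, 1.0))        # mu = 1: the form (y3 - y1)/(y2 - y0)
print(estimate_r4(y, 2.0))        # mu = 2: the form suited to small rho
```

With error-free data every choice of μ recovers ρ exactly; the choice of μ matters only for the precision of r when the y are subject to error.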

It seems reasonable to consider the estimate (6) when an intermediate
value of μ is used. A value of μ = 1.25, corresponding to ρ approximately
equal to 0.36, has been chosen. The estimate

$$r = \frac{4y_3 + y_2 - 5y_1}{4y_2 + y_1 - 5y_0} \tag{8}$$

has an efficiency of over 95% over the whole practical range of values
of ρ, as shown in Table 1.


Estimation of ρ with five, six, or seven equally spaced levels

For n = 5, 6 or 7 ρ can be estimated in a similar manner from the
ratio of two simple contrasts between the values of y. Additional
equations such as

$$y_4 - (1 + r) y_3 + r y_2 = 0 \tag{9}$$

are combined with Equations (2) and (3) using weights 1, μ_1, μ_2, etc.,
the weight 1 being attached to the equation involving the three highest
levels. The general form of the estimates obtained in this way is

$$r = \frac{y_{n-1} + (\mu_1 - 1) y_{n-2} + (\mu_2 - \mu_1) y_{n-3} + \cdots + (\mu_{n-3} - \mu_{n-4}) y_2 - \mu_{n-3} y_1}{y_{n-2} + (\mu_1 - 1) y_{n-3} + (\mu_2 - \mu_1) y_{n-4} + \cdots + (\mu_{n-3} - \mu_{n-4}) y_1 - \mu_{n-3} y_0}. \tag{10}$$

TABLE 1
Percentage Efficiencies of Proposed Estimates of ρ

        Equation:    (8)      (17)     (18)     (19)
   ρ                n = 4    n = 5    n = 6    n = 7

  0.0               89.3     77.4     65.2     60.0
  0.1               95.4     88.5     79.2     74.5
  0.2               98.6     95.7     90.2     86.8
  0.3               99.8     99.2     97.0     95.0
  0.4               99.9     99.9     99.6     98.1
  0.5               99.4     98.9     98.8     96.8
  0.6               98.5     97.0     96.1     92.4
  0.7               97.4     94.7     92.5     86.9
  0.8               96.3     92.3     88.8     81.3
  0.9               95.3     90.0     85.2     76.3
  1.0               94.2     87.8     81.9     72.0


Values of the μ can be chosen so that these estimates are the same as
those given by Stevens. For the purpose of proving this statement it is
convenient to rewrite Equation (10) as

$$\sum_x \lambda_x y_x = 0, \tag{11}$$

where the λ_x are subject to the restrictions

$$\sum_x \lambda_x = 0, \quad \sum_x r^x \lambda_x = 0, \quad \text{and} \quad \sum_x x r^{x-1} \lambda_x = 1.$$

The first two restrictions arise because there are n values of λ, but
only n − 3 values of μ.

If the λ_x are chosen so that

$$\lambda_x = F_{ar} + F_{br} r^x + F_{rr} x r^{x-1}, \tag{12}$$

Equation (11) and the three restrictions become

$$F_{ar} \sum y + F_{br} \sum r^x y + F_{rr} \sum x r^{x-1} y = 0, \tag{13}$$

$$n F_{ar} + F_{br} \sum r^x + F_{rr} \sum x r^{x-1} = 0, \tag{14}$$

$$F_{ar} \sum r^x + F_{br} \sum r^{2x} + F_{rr} \sum x r^{2x-1} = 0, \tag{15}$$

and

$$F_{ar} \sum x r^{x-1} + F_{br} \sum x r^{2x-1} + F_{rr} \sum x^2 r^{2x-2} = 1. \tag{16}$$

Stevens has shown that his estimates of ρ satisfy Equation (13),
where F_ar, F_br and F_rr are determined from Equations (14), (15) and (16).
Consequently Equation (10) must include, as particular cases, the efficient
estimates given by Stevens. As these estimates are efficient they can
also be obtained by minimising the variances of the expression given in
(10).
It is of interest to note that, for a given r,

$$\lambda_x \propto J_x,$$

where the J are the polynomials used by Pimentel Gomes in Equation
(1).

Values of μ_1, μ_2, ... appropriate to various values of ρ in the range
ρ = .3 to ρ = .5 have been determined for the cases n = 5, 6 and 7,
using the F_ar, F_br and F_rr provided by Stevens. The following values
have been found to lead to estimates of high efficiency (over 90%) over
the practically useful range of ρ:

    n = 5:  μ_1 = 1.75, μ_2 = 1.5;
    n = 6:  μ_1 = 2, μ_2 = 2.5, μ_3 = 1.75;
    n = 7:  μ_1 = 2, μ_2 = μ_3 = 3, μ_4 = 2.

The actual estimates are

$$r = \frac{4y_4 + 3y_3 - y_2 - 6y_1}{4y_3 + 3y_2 - y_1 - 6y_0} \quad \text{for } n = 5, \tag{17}$$

$$r = \frac{4y_5 + 4y_4 + 2y_3 - 3y_2 - 7y_1}{4y_4 + 4y_3 + 2y_2 - 3y_1 - 7y_0} \quad \text{for } n = 6, \tag{18}$$

$$r = \frac{y_6 + y_5 + y_4 - y_2 - 2y_1}{y_5 + y_4 + y_3 - y_1 - 2y_0} \quad \text{for } n = 7. \tag{19}$$


In the case n = 7 slightly better results can be obtained by using
μ_1 = 2.2, μ_2 = 3.1, μ_3 = 3.3 and μ_4 = 2.2. These values give almost
fully efficient estimates when ρ is about 0.4. The values given above do
not give efficient estimates for any ρ but have the advantage of being
whole numbers.
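
The passage from a set of weights μ_1, ..., μ_{n−3} to the contrast coefficients of Equation (10) can be sketched as follows. This is an illustrative helper, not part of the original method description; the function names are hypothetical, and the weights are read as 1, μ_1, μ_2, ... starting from the equation involving the highest levels.

```python
def contrast_coefficients(mus):
    """Coefficients of y_{n-1}, ..., y_1 in the numerator of Equation (10).

    mus = [mu_1, ..., mu_{n-3}]. The same coefficients, applied to
    y_{n-2}, ..., y_0, give the denominator.
    """
    weights = [1.0] + [float(m) for m in mus]
    coeffs = [weights[0]]
    coeffs += [weights[i] - weights[i - 1] for i in range(1, len(weights))]
    coeffs.append(-weights[-1])
    return coeffs


def ratio_estimate(y, mus):
    """Estimate r of rho from responses y_0..y_{n-1} at equally spaced levels."""
    c = contrast_coefficients(mus)
    n = len(y)
    if len(c) != n - 1:
        raise ValueError("need n - 3 weights for n responses")
    top = sum(ci * y[n - 1 - i] for i, ci in enumerate(c))
    bottom = sum(ci * y[n - 2 - i] for i, ci in enumerate(c))
    return top / bottom


# Multiplying by 4 clears the fractions for n = 5 and n = 6.
print([4 * c for c in contrast_coefficients([1.75, 1.5])])
print([4 * c for c in contrast_coefficients([2, 2.5, 1.75])])
print(contrast_coefficients([2, 3, 3, 2]))
```

The printed coefficients can be compared term by term with the numerators of the estimates (17), (18) and (19).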

The estimates (8), (17), (18) and (19) are likely to be most useful
when values of ρ are required from a number of similar experiments,
possibly to check on an assumed value. Thus many problems of determining
optimum dressings of fertilizer can be solved satisfactorily by
using the values of k = −log_10 ρ determined from large numbers of
experiments and suggested by Crowther & Yates (1941). In such cases
the assumed values of ρ can be rapidly and easily checked. Full
efficiency is not required for such purposes.

The proposed estimates can also be used to advantage as preliminary
estimates of ρ in Stevens' efficient method, when the computations
have to be carried out on a desk machine. In most cases only a single
iteration is required.

Example


As an example of the procedure when n = 5 the data given by Stevens
(1951), p. 261, will be considered. The y are total yields of wheat in
kgs over five plots of a Latin square:

    y:  44.4   54.6   63.8   65.7   68.9     Mean 59.48
    x:   0      1      2      3      4

The estimate of ρ is simply

$$r = \frac{4 \times 68.9 + 3 \times 65.7 - 63.8 - 6 \times 54.6}{4 \times 65.7 + 3 \times 63.8 - 54.6 - 6 \times 44.4} = 0.610.$$

Values of r^x are:

    r^x:  1   0.610   0.372   0.227   0.138     Mean 0.4694

The regression of y on r^x is

$$\frac{\sum y \, (r^x - \overline{r^x})}{\sum (r^x - \overline{r^x})^2} = -28.66.$$

This is an estimate of −β. α is estimated by

    59.48 + 28.66 × 0.4694 = 72.93.

These estimates show close agreement with the efficient estimates which
Stevens gives as:

    α = 72.43 ± 5.57,  β = 28.24 ± 5.41,  ρ = 0.597 ± 0.140.

When used as a preliminary estimate in Stevens' method, r = 0.610
leads to the following estimates of α, β and ρ:

    α = 72.38,  β = 28.21,  ρ = 0.595.
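
The arithmetic of this example can be retraced with a few lines of code. The sketch below is illustrative only (the function name is hypothetical); it carries r at full precision, so the last digit of the fitted α and β may differ slightly from the hand computation, which rounds r^x to three decimals.

```python
def fit_asymptotic_n5(y):
    """Fit y = alpha - beta * rho**x for x = 0..4 using estimate (17)."""
    # Ratio of two contrasts between the y (Equation (17)).
    r = (4 * y[4] + 3 * y[3] - y[2] - 6 * y[1]) / (
        4 * y[3] + 3 * y[2] - y[1] - 6 * y[0]
    )
    # Linear regression of y on z = r**x; the slope estimates -beta.
    z = [r**x for x in range(5)]
    y_bar = sum(y) / len(y)
    z_bar = sum(z) / len(z)
    slope = sum(yi * (zi - z_bar) for yi, zi in zip(y, z)) / sum(
        (zi - z_bar) ** 2 for zi in z
    )
    beta_hat = -slope
    alpha_hat = y_bar + beta_hat * z_bar
    return r, alpha_hat, beta_hat


yields = [44.4, 54.6, 63.8, 65.7, 68.9]   # data quoted from Stevens (1951)
r, alpha_hat, beta_hat = fit_asymptotic_n5(yields)
print(round(r, 3), round(alpha_hat, 2), round(beta_hat, 2))
```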


Efficiencies of the proposed estimates


Stevens showed that the variances of the efficient estimates of ρ are
given by

$$V(r) = \frac{\sigma^2 F_{rr}}{\beta^2}.$$

The variances of the estimates given above can be obtained by application
of the usual formula for the variance of a ratio:

$$\operatorname{var}\!\left(\frac{a}{b}\right) = \left(E\!\left(\frac{a}{b}\right)\right)^2 \left\{ \frac{\operatorname{var}(a)}{a^2} + \frac{\operatorname{var}(b)}{b^2} - \frac{2 \operatorname{cov}(a, b)}{ab} \right\}.$$

When applied to the efficient estimates this formula gives results identical
with those provided by Stevens.

Thus the efficiency of (4y_3 + y_2 − 5y_1)/(4y_2 + y_1 − 5y_0) is

$$\frac{F_{rr} \, (4\rho^2 + \rho - 5)^2}{42(1 + \rho^2) + 2\rho}.$$

The percentage efficiencies of the estimates (8), (17), (18) and (19)
are set out in Table 1.
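
For n = 4 the efficiencies can be approximated without Stevens' tables. The sketch below rests on three assumptions that should be read as such: errors are independent with equal variance; the large-sample ratio formula is adequate; and, as stated earlier in the paper, estimate (6) with the optimum μ is fully efficient, so that the efficiency of any other μ is a ratio of two variance factors in which σ², β² and the factor (1 − ρ)² cancel. The closed form for the optimum μ is the assumed form of Equation (7), and the function names are hypothetical.

```python
def mu_opt(rho):
    """Optimum weight for n = 4 (assumed closed form of Equation (7))."""
    return (rho**3 + 4 * rho**2 + 3 * rho + 2) / (
        2 * rho**3 + 3 * rho**2 + 4 * rho + 1
    )


def relative_variance(rho, mu):
    """Large-sample variance of estimate (6), up to a factor common to all mu."""
    # Coefficients of y_3, y_2, y_1, y_0 in Equation (5), evaluated at r = rho.
    lam = [1.0, mu - rho - 1.0, -(mu + mu * rho - rho), mu * rho]
    # The expected denominator contrast is proportional to (rho + mu).
    return sum(c * c for c in lam) / (rho + mu) ** 2


def efficiency(rho, mu=1.25):
    """Efficiency of estimate (6) with weight mu relative to the optimum weight."""
    return relative_variance(rho, mu_opt(rho)) / relative_variance(rho, mu)


for rho in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(rho, round(100 * efficiency(rho), 1))
```

The printed percentages can be set beside the n = 4 column of Table 1.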

As might be expected the efficiencies fall off for low and high values
of ρ. This is, however, of little consequence since for these values
r^{1/s} is in any case inaccurately determined whether an efficient estimate
of ρ is used or not.

Thus, for n = 7 if s is such that ρ = 0.1 or 0.7 the variance of r^{1/s}
is about three times as great as when ρ = 0.4. In general high and low
values of r indicate that the intervals between levels of fertilizer are
too small and too great respectively for the accurate determination of
curvature.

The two series of estimates which tend to be fully efficient as ρ tends
to 1 or 0 can, however, be expressed simply and may be useful occasionally.
The first series, which is more efficient than the estimates proposed
above for ρ greater than about 0.6, is given by substituting the
values μ_i − μ_{i−1} = (i + 1)(n − i − 1)(n − 2i − 2)/[(n − 1)(n − 2)] in
Equation (10). The second series, for which μ_i = i + 1, covers the
range of ρ less than about 0.15.
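
The two limiting series of weights can be tabulated with a short sketch. It assumes that the increments of the first series are to be read as μ_i − μ_{i−1} = (i + 1)(n − i − 1)(n − 2i − 2)/[(n − 1)(n − 2)] with μ_0 = 1 (the weight of the equation involving the highest levels); this reading is an interpretation, chosen because it reproduces μ = 1 for n = 4, and the function names are hypothetical.

```python
def mu_series_high_rho(n):
    """mu_1..mu_{n-3} for the series that tends to full efficiency as rho -> 1."""
    mus = []
    mu = 1.0  # mu_0 = 1 (assumed starting value)
    for i in range(1, n - 2):
        mu += (i + 1) * (n - i - 1) * (n - 2 * i - 2) / ((n - 1) * (n - 2))
        mus.append(mu)
    return mus


def mu_series_low_rho(n):
    """mu_1..mu_{n-3} for the series that tends to full efficiency as rho -> 0."""
    return [float(i + 1) for i in range(1, n - 2)]


for n in (4, 5, 6, 7):
    print(n, mu_series_high_rho(n), mu_series_low_rho(n))
```

For n = 4 the two series give μ = 1 and μ = 2, the end points already noted for Equation (6).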




SUMMARY

Simple methods for fitting the regression equation

$$y = \alpha - \beta \rho^x, \quad \text{where } 0 < \rho < 1,$$

are suggested for cases in which the ordinates x are equally spaced and
take 4, 5, 6 or 7 values.

The estimates of ρ are given by the ratios of two contrasts between
the values of y. These contrasts are chosen so that the efficiencies of
the proposed estimates are high over the range of useful values of ρ.

Once ρ has been estimated α and β can be determined by a straightforward
linear regression.


SOME PROBLEMS OF EXPERIMENTAL DESIGN AND
TECHNIQUE WITH PERENNIAL CROPS*


S. C. Pearce
East Malling Research Station, Kent, England


Most statistical problems of experimentation with perennial plants 
arise from the longevity of the material, though a few are the result of 
the large size of the plants. Satisfactory classification is not easy, but 
here the problems will be considered in three groups. 

1. Problems in scientific approach arising because few experiments 
are possible in a reasonable period of time, and consequently the best use 
must be made of them. 

2. Problems in trial design and interpretation, mostly of a mathematical
nature, concerning desirable developments in statistical
techniques.

3. Problems in the variability and measurement of plants, and the 
relations between various measurements. 


1. Scientific approach 


When experiments each last for several years it is not possible to 
do many of them; but there are sciences, geology and astronomy, for 
example, in which no experiments are possible at all, yet valuable 
advances in knowledge have been made. The difficulty of experimenting 
with long-lived plants does not mean therefore that the acquisition of 
scientific knowledge is impossible; but it does call for a consideration 
of the means available and a careful study of their limitations. 


*Paper presented to the International Biometric Symposium, Campinas, Brazil, on 5th July 1955. 
A somewhat longer version, similar in substance though not in detail, was kindly prepared by Senhor 
Ediliberto Amaral from an original by the writer, and entitled "Problemas especificos de delineamento
e técnica experimental em culturas perenes".

The obvious alternative is to make good use of systematized observations,
as is done in the sciences mentioned. An astronomer cannot
command an eclipse of the sun, but he can forecast when the next will 
take place and study that; in like manner, it might be thought, a research 
worker, unable to carry out a trial on, say, the irrigation of orange trees, 
might examine plantations in which irrigation had taken place and 
compare them with others that had not been so treated. With annuals 
this method has worked very well. It might be expected to work with 
trees also, but it must be noted that many additional factors are in- 
volved. Thus, it is not uncommon to find fields of annuals all of the 
same variety treated in much the same way except for the factor under 
study, the plants being necessarily of the same age; but plantations of 
perennials can differ in so many respects, such as age, rootstock, tree 
form and spacing, that in survey methods it is rarely possible to disen- 
tangle all the factors. 

The same consideration applies to extensive experiments carried out 
on a number of commercial plantations. If similar results are obtained 
at each site, the agreement is impressive, and any recommendation 
based upon it must be of wide application; but if some sites give dis- 
cordant results it is rarely possible to identify with certainty the factor 
responsible. 

Nevertheless, much knowledge of perennial plants has been obtained. 
Because of the difficulties of experimenting, however, much of it has 
come by other channels, and it is necessary to recognize this and to 
consider if these other channels can be improved. One valuable source 
of knowledge, for example, is the advisory officer. When visiting 
growers he receives as well as gives information, and in this way he 
accumulates evidence until it is clear to him either that a certain new 
method is an improvement or that opinion about it is so divided that 
it cannot represent an important advance. Such an approach can 
readily be systematized statistically; a group of growers selected at 
random from those with experience of the matter might be approached, 
and if their replies were inconclusive a further group might be asked, 
and so on until a result was obtained. Since much knowledge, especially 
about new varieties, does in fact come by this channel, it would be a 
good thing if it could be obtained more systematically and with proper 
statistical safeguards. 

Basically, however, the problem must be to improve experimenta- 
tion, and the best way of doing this is to have a clear purpose behind 
each trial. In any large set of treatments it is apparent that many 
comparisons can be made between them or between groups of them. 
With only six treatments, for example, there are 301, and by studying
enough comparisons something significant can usually be found. In
practice this does not matter, because no one does test separately all 
possible comparisons but only those that have an intelligible meaning; 
but to select these requires some knowledge outside that given by the 
trial itself. 
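
One way of arriving at the figure of 301 is to count every comparison of one non-empty group of treatments against another, disjoint, non-empty group, ignoring the order of the two groups. That reading is an assumption, since the text does not say how the comparisons are enumerated; the sketch below simply shows that it reproduces the number quoted. The function name is hypothetical.

```python
from itertools import product


def count_group_comparisons(t):
    """Count unordered pairs of disjoint, non-empty groups among t treatments."""
    count = 0
    # Label each treatment: 0 = left out, 1 = first group, 2 = second group.
    for labels in product((0, 1, 2), repeat=t):
        if 1 in labels and 2 in labels:
            count += 1
    # Swapping the two groups gives the same comparison.
    return count // 2


print(count_group_comparisons(6))
```

The same count is given by the closed form (3^t − 2^(t+1) + 1)/2, which grows quickly with the number of treatments t.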

Reflection will show that this outside knowledge is used in three ways: 

a. by previous explanation. An experimenter by studying existing 
knowledge perceives certain laws or regularities, and wishes to test if 
these are coincidental or genuine by carrying out an experiment, de- 
signed to give a specified result if he has been thinking on the right lines, 
but not otherwise. 

b. by subsequent explanation. An experiment has been performed 
the result of which was not foreseen, but later consideration suggests 
an explanation. 

c. by further examination. A result has been obtained for which no 
explanation can be advanced. In view of its importance, if true, further 
work is done, and the same result appears again. 

Of these three, the last is mere empiricism, the gathering together 
of facts, not the understanding of the permanent elements that lie 
behind facts, which is the domain of science. With annual plants, 
where experiments are readily repeated, it provides a useful practical 
approach. With long-lived plants, on the contrary, where it is more 
difficult to accumulate facts by sheer weight of evidence, it is necessary 
always to seek an understanding of what is being observed. But this 
understanding should be sought as early as possible. The second of 
the ways mentioned above, obtaining a result and thinking up an explan- 
ation afterwards, is rarely convincing; if the explanation is complex it 
is judged to be over-ingenious and if simple to be belated. At the best 
it is a chancy method, and depends very much on the integrity of the 
experimenter. The most powerful experiments are those in the first 
class, for they do not only establish a fact but justify a system of thought. 
When Galileo dropped two balls from the Leaning Tower of Pisa 
he did not merely show that gravitational acceleration is independent of 
mass but he shattered the whole structure of Aristotelian physics. 
Where experiments are necessarily few and costly it is powerful ones 
like these that must be sought. 

Also, if the end of experimentation with perennial plants must 
always be comprehension, it is not enough merely to measure crop or 
whatever is immediately under study. It does not suffice to assert 
that in a particular trial Treatment A gave more crop than B without 
knowing why, for there will rarely be enough empirical evidence to 
establish the fact as a general rule, and its broad acceptance must 
derive from the mechanism of the effect being inherently reasonable
and of a sort likely to operate in a wide range of conditions. Conse- 
quently, the good experimenter will measure not only the weight of 
fruit, but also growth in its various aspects, disease, leaf colour, amount 
of blossom, and perhaps much else, so that his crop figures do not stand 
in isolation, but are seen as the culmination of a biological process, 
each part of which has taken place beneath a comprehending scrutiny. 


2. Trial design 


Problems in trial design that arise with perennial plants are not at 
the moment especially numerous, possibly because of the relatively 
large amount of work that has already been done. 

First, there is the need to develop row and column designs so that 
best use can be made of limited spaces. Perennial plants are mostly 
large, so if an area will take only seven rows of five trees there is room 
for no more than 35 plots; it is no use complaining that with six treat- 
ments 36 plots would be much more convenient. Further, if efficient 
use is to be made of the area, it is desirable to adopt a design that will 
allow for positional effects by the use of rows and columns rather than 
by blocks, for such designs will allow for the differences between outside 
and inside trees and so permit both to be used, subject to certain obvious 
restrictions. It happens that the evolution of such designs is formally 
part of a larger problem that at first sight is quite different, namely, 
that of changing the treatments on long-lived plant material. Whenever 
an experiment is concluded while the plants are still in their prime, as 
often happens, the possibility of using the same trees for further experi- 
ments must be examined. If the first trial is in randomized blocks, the 
problem is the same as that already mentioned, namely, that of adding 
a third classification to plots already classified in two mutually ortho- 
gonal ways (rows and columns in the one instance, blocks and original 
treatments in the other), and of doing so in such a manner that the 
new classes are compared to greatest advantage. Of course, the first 
trial is not necessarily in randomized blocks, so the problem of changing 
treatments is the wider one, and in its mathematical aspects includes 
that of using restricted areas economically. The complete solution of 
this problem must await the compilation of a comprehensive catalogue 
of balanced designs, but already considerable progress has been made. 
Thus, in the problem mentioned, i.e., applying six treatments to 35 
trees arranged seven by five, the modern experimenter has the resource 
of using a Latin square with a column added and a row omitted. 

The large size of most perennial plants suggests the use of parts of 
an organism, e.g., the branches of a tree, as plots of an experiment, the 
word “organism” underlining the possibility of treatments so applied
affecting other plots of the same block. Thus, in an experiment upon
pollination some blossoms of a tree may be pollinated with Variety A
and others with B; but it may be that the good set of the first will inhibit
setting elsewhere, including those blossoms pollinated with B. A little
reflection will show that given complete block designs there is no way
of distinguishing the local effect, γ, of a treatment upon the plots to
which it is applied from the remote effect, δ, upon the plots elsewhere
in the organism. The plots to receive it show, for example, ten per
cent better set than those that have been treated otherwise, but is this
because they have been improved, or the rest have been worsened, or
something of each? The difference equals (γ − δ) and there is no
means of determining the individual values of the two parameters. If,
however, incomplete block designs are used, the treatment effects shown
by intra-block comparisons equal (γ − δ) as before, but those from
inter-block comparisons equal [γ + (k − 1)δ], where k is the number
of plots to a block. This technique is already being used in an exploratory 
way at East Malling, where much of the mathematics has been worked 
out. Thus, experiments have been carried out on the effects of different 
kinds of pollen on fruit set and on the effects of different degrees of 
defoliation on fruit size. There is advantage both in the use of plots 
smaller than a whole tree and in the avoidance of doubtful assumptions 
about relationships within an organism. Further, isolation of the remote 
effects sometimes sheds light on the mechanism of the treatments under 
study. 
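
Given the two expectations stated above, the separation of local and remote effects is a matter of solving two linear equations. The sketch below is a minimal illustration with a hypothetical function name and made-up numbers; it ignores the different precisions of intra-block and inter-block estimates, which a real analysis would have to take into account.

```python
def separate_effects(intra, inter, k):
    """Solve intra = local - remote and inter = local + (k - 1) * remote.

    intra: treatment effect estimated from intra-block comparisons
    inter: treatment effect estimated from inter-block comparisons
    k:     number of plots to a block
    """
    remote = (inter - intra) / k
    local = intra + remote
    return local, remote


# Illustrative values only: a local gain of 8 with a remote loss of 2, k = 4.
local, remote = separate_effects(intra=10.0, inter=2.0, k=4)
print(local, remote)
```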

The last problem in this group that will be considered concerns the 
combining of several years’ results. It is dangerous to assume that the 
total crop over a period is all that matters, and equally dangerous to 
concentrate on single years in isolation; the pattern of bearing, whether 
the cropping tends to increase, or to decrease, or to fluctuate, or to be 
irregular or regular, may be important. To a large extent these charac- 
teristics are determined by the varieties themselves and the age of the 
trees, but they also may be affected by the treatments and it is most
important to know to what extent. Probably the best way of dealing 
with this problem is one associated with Brazil, namely, that suggested 
by Professor Stevens of Sao Paulo, in which various linear functions of 
annual yields are examined successively, each function being chosen to 
measure some characteristic of the pattern of cropping, such as the 
biennial swing, or the shape of the curve after this has been allowed for. 
The method can be used equally well for growth, incidence of disease, 
or anything else. One proviso may be mentioned here. The fitting 
of curves to past crops is a useful descriptive device; their extrapolation 
to give information about future ones is quite another matter. The
only satisfactory way of finding out about crop during twenty years 
is to grow the trees for that period. 

Valuable though this method is—it has incidentally the advantage 
of not being vitiated by the variability differing from year to year—it 
does not go far enough. For one thing, cyclic effects are not always 
biennial, and it has been suggested that for some crops, such as coffee, 
the period is between two and three years, and with forest trees much 
longer. Also, a biennial rhythm can be thrown out of phase by meteoro- 
logical factors. For another thing, the general shape of the curve after 
smoothing out cyclic effects is not easily expressed in any way that 
makes good biological sense. The general approach is no doubt sound, 
namely, the evolution of tests to supplement an analysis of variance on 
total crop borne over the whole period; but these tests, when obtained, 
may well prove more subtle than examination of linear functions. 
Here is a problem that needs much more thought than it has received 
up to the present. 


3. Variability and measurement 


Although the second group of problems may be said to be few and 
simple, those of the last group are numerous and complex, and also 
have to be considered anew for each species. 

One of the first importance is the determination of sources of error, 
because this may largely affect the design. Variation can come from 
two main sources, the trees themselves or positional effects, and when- 
ever a trial is proposed it is necessary to be clear if one or both of these 
sources needs to be considered, otherwise a lot of trouble can be taken 
controlling variation that is negligible anyway. 

In particular, if most variation arises in the plants themselves there 
will be a great need to find characteristics apparent at planting time 
that can be used to give worth-while adjustments by the method of 
covariance. There would not, however, be much gained by attempts to 
control positional effects, and any complications of design introduced 
for this purpose would probably on the whole do more harm than good. 
On the other hand, if the variation is mostly positional, devices such as 
confounding and incomplete blocks may be valuable, and it would also 
be worth-while to go to the trouble of using many small plots rather 
than a few large ones. 

When trees come from the nursery for planting it is advisable to 
put them into their permanent places at random. Where this is done 
the initial variation is associated with the trees, and has no positional 
elements in it. Then, when the trees begin to grow, further variation 
is introduced on account of differences in growth rate induced either by
position, or by genetical and other factors inherent in the trees. In 
fact, some positional variation may well arise, but it may be negligible 
or it may dominate according to circumstances. 

The ideal is, no doubt, to use plant material of the greatest uni- 
formity—and it is impossible to stress too much the need for good, 
standardized plants—and also to use the smallest blocks possible so as 
to minimize positional variation. However, a certain sense of proportion 
is called for. Errors combine in such a way that reduction of the smaller 
component is rarely worth bothering about; it is always the larger 
that must be tackled. But which, for any species in any locality, is the 
larger? So often no one has ever thought to enquire, but it is a basic 
problem. Of course, initially the variation always comes from the 
trees; the question should be put "Will the positional effects come to
matter?".

Other biometric problems are those of measurement and the rela- 
tionships between measurements. When discussing questions of scien- 
tific approach to investigations with perennial plants it was suggested 
that a comprehending scrutiny should be maintained; it is all very well 
to say that, but how is it to be done? For example, the activities of a 
woody plant depend upon its ability to form a structure from which
leaves can be displayed to the sun’s rays, but has anyone ever found 
out how to measure this extension growth rapidly but effectively? On 
the other hand, the trunk circumference can be measured easily enough, 
but does anyone know what it means? In general, of course, the thicker 
the trunk the larger the tree; but with apples, and probably with other 
species also, severe pruning leads to growth which at first finds no reflec- 
tion in the trunk. Also, in apples, the relationship of tree weight to 
trunk circumference depends upon the rootstock; so evidently the 
measurement must be used with some circumspection in trials of 
pruning methods and of rootstocks. Again, growth is apparently 
inhibited by cropping, which is related to blossoming, but is it the crop 
or the presence of fruit buds that inhibits the growth? There is some 
evidence that it is the blossoms that check the growth; but comprehen- 
sion of what is happening in a tree is impossible until fairly clear answers 
have been found to this and many similar questions. Admittedly 
these are physiological problems also, but the statistician with his 
methods of regression analysis and variance components has no excuse 
for idleness either. It is not too much to say that inadequacy of measur- 
ing techniques and the ignorance of relationships between measure- 
ments are the chief weaknesses in present-day experimentation with 
perennial plants, and ones that present a continual challenge to the 
statistician. It has been suggested in the first section that an experiment
must test a way of thought, that it must be a seeking of comprehension 
of the factors behind phenomena—that empiricism is not enough; but 
this is futile if no means are at hand for exploring what goes on inside 
a tree without killing it to find out. The need all the time is to understand,
and this we often cannot do because we cannot measure.

In putting forward these problems for consideration no attempt has 
been made at comprehensiveness, although those advanced are perhaps 
fairly representative of the range and variety of problems to be encountered.
Their enumeration will have served a useful purpose if it
does no more than show that experimentation with perennials is a
branch of statistics with a number of problems of its own, some of which
have received far too little attention.


QUERIES 


George W. Snedecor, Editor


QUERY 122: I would appreciate your advice regarding the statistical
analysis of the following problem. The problem concerns the
length of gestation period in cattle and the data are classified
according to three factors (1) sex of the calf born (2) the pregnancy
order of the dam (3) month of calving. This is then a 2 × R × C table
where R = 7 and C = 12 in the present data. The data are coded by
subtracting 260 days from each gestation period (the range was 264
to 296 days). The number of observations in the cells of the table
varied from 0 to 19, there being 15 cells with no observation. Values
were calculated (to the nearest whole number) for these 15 cells from
the means for sex, pregnancy number and month of calving on the
assumption that there was no interaction between the various factors.

The information I wish from the data is as follows:

1. The difference in gestation length because of the sex of the calf.
2. Differences in gestation length due to pregnancy order.
3. Seasonal differences in gestation length.
4. The presence or absence of interactions between these factors.
5. Estimates of the proportions of total variance attributable to these
factors.


ANSWER: Had no cells been missing in your 2 × 7 × 12 table I would
have recommended an unweighted analysis of the cell means, except for
intracell variation, although the number of observations per cell
varies considerably. Missing cells, however, and particularly as many
as 15 missing cells, necessitate a least squares analysis in order to
obtain the itemized information, if the full classification of each
factor is to be maintained. But the inclusion of interactions in the
model makes this method too laborious and impractical, if not impossible.

A feasible alternative, when one or more of the classifications of a 
factor can be logically grouped together so that no cells are missing, 
is to form a new 2 × R′ × C′ (R′ < R and/or C′ < C) table with no
missing cells. This procedure should be applicable to your data since the 
missing cells most likely come from having few observations on dams 
with high pregnancy orders (5, 6 and 7). It seems probable, with only 
15 cells missing out of 168 as you now have the data classified, that
grouping pregnancy orders 6 and 7 together or at most 5, 6 and 7 
together would leave no missing cells. Seasonal calving would lead to 
few observations in certain months and this could be overcome by 
grouping the months into seasons. However seasonal calving is not 
generally regulated to the extent that it is expected to be a prominent 
factor in disproportionality. 

Having grouped the classifications so that no cells are missing the 
usual analysis of variance of the cell means can be performed. The 
expectations of the mean squares are those for a single observation in 
each cell except that the coefficient of the within cell variance is the 
reciprocal of the harmonic mean number of observations in the cells. 
The within cell variance is computed separately and pooled over cells. 
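
As a rough illustration of the two quantities just described, the following Python sketch computes the reciprocal of the harmonic mean of the cell counts and a within-cell variance pooled over cells. The function names and the small data set in the usage note are hypothetical; nothing here is taken from the gestation data.

```python
from statistics import harmonic_mean


def within_cell_coefficient(cell_counts):
    """Reciprocal of the harmonic mean of the numbers of observations per cell."""
    counts = list(cell_counts)
    if not counts or any(n <= 0 for n in counts):
        raise ValueError("every cell must contain at least one observation")
    return 1.0 / harmonic_mean(counts)


def pooled_within_variance(cells):
    """Within-cell variance pooled over cells (sum of squares / pooled d.f.)."""
    sum_of_squares = 0.0
    degrees_of_freedom = 0
    for observations in cells:
        mean = sum(observations) / len(observations)
        sum_of_squares += sum((x - mean) ** 2 for x in observations)
        degrees_of_freedom += len(observations) - 1
    if degrees_of_freedom == 0:
        raise ValueError("at least one cell needs two or more observations")
    return sum_of_squares / degrees_of_freedom
```

For four hypothetical cells holding 2, 4, 4 and 4 observations the harmonic mean is 3.2, so the coefficient of the within-cell variance in the expected mean squares would be 1/3.2.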

The information you wish through item 4 is obtained from the above 
analysis of variance or by direct comparisons of unweighted means. 
For item 5, to estimate the proportions of total variance attributable to 
these factors, it must be remembered that the model corresponds to a 
fixed one in that the classifications cover the full range of effects for sex 
and month of calving and for all practical purposes for pregnancy order. 
However, I can see no practical harm in assuming a completely random 
model and to apportion the total variance to the various sources by this 
method. 

The grouping of adjacent classes where an order is defined by the 
classification such as in pregnancy order reduces the variances stemming 
from this source. However, when the classes grouped are not very 
different, and in the case of pregnancy order, the orders 5, 6 and 7 are 
usually about the same for most characters, the reduction in variance 
should be slight, if any. 


C. CLARK COCKERHAM


CORRECTION TO “FRACTIONAL REPLICATION FOR 
MIXED SERIES” 


Milton Morrison
Experimental Towing Tank, Stevens Institute of Technology 


Dr. K. A. Brownlee has pointed out that there is a slip on page 11 
of my paper (this journal, Vol. 12, 1956, pp. 1-19). The degrees of 
freedom for E, AE, BE, CE and DE should each be 3, and not 2. Those 
for the residual should be correspondingly reduced from 11 to 6. 


NEWS AND ANNOUNCEMENTS 


Members are invited to transmit to their National or Regional Secretary 
(if members at large, to the General Secretary) news of appointments, 
distinctions or retirements and announcements of professional interest. 


DR. JOHN WISHART 


Dr. John Wishart, Reader in Statistics at Cambridge University, 
drowned by misadventure while swimming from El Revolcadero beach, 
Port of Acapulco, Mexico on July 14th, 1956. Dr. Wishart was dis- 
tinguished for his theoretical contributions, especially in respect of 
the distribution of moment and product moment statistics, and for his 
applied work in agricultural experimentation. He was a charter member 
of the Biometric Society. He had been in Mexico since January, 1956 
in connection with the Centro Interamericano de Capacitation sobre 
el Uso de Metodos Estadisticos en la Experimentation Agricola, spon- 
sored by the Government of Mexico, and by FAO. 


Professor Ralph A. Bradley was the recipient of the Brumbaugh 
Award of the American Society for Quality Control, given annually 
to the author of the article published in Industrial Quality Control, 
judged to have made the greatest contribution to the development of 
industrial applications of quality control. 


Under an agreement into which the Johns Hopkins University 
entered some years ago, Professor W. G. Cochran was visiting professor 
at the University of the Philippines during the months of June and 
July 1956. 


The eminent distinction of the Copley Medal of the Royal Society 
of London was awarded to Professor Sir Ronald A. Fisher, F.R.S.,
“in recognition of his distinguished contributions to developing the
theory and application of statistics for making quantitative a vast
field of biology”.


Dr. C. H. Goulden, previously awarded the Gold Medal of the
Professional Institute of the Public Service of Canada for his scientific
contributions, is now Director, Experimental Farms Service, Canada 
Department of Agriculture. 


Professor George W. Snedecor’s many outstanding contributions to 
statistics were recognized when an honorary Doctorate of Laws was
conferred on him by the University of North Carolina, May 24, 1956.


THE BIOMETRIC SOCIETY
International Affairs


The Society has been asked to organize three sessions for the meeting 
of the International Statistical Institute to be held in Stockholm from 
August 8 to 15, 1957. Plans are also being made for the 4th Inter- 
national Biometric Conference to be held in Ottawa during 1958, 
probably from August 28 to September 1. Details of these meetings 
will be sent to members as soon as available. 

Biometric Seminars are being held in Linz from September 24 to 
October 3, 1956, and in Milan from October 8 to October 20. 


Switzerland 


The Swiss section of the Society held its annual meeting on July 
14, 1956 in Zurich. The programme included communications by H. L. 
LeRoy: Die Inzuchttheorie von R. A. Fisher; P. Geier: Les rapports 
numériques entre un Lépidoptére nuisible aux cultures fruitiéres et 
certains de ses antagonistes. 

At this meeting, Dr. H. L. LeRoy was elected National Secretary 
in place of Professor A. Linder. 


ENAR 


A joint meeting with the American Institute of Biological Sciences 
was held at Storrs, Conn. on August 28-29. The programme included 
the following papers:— 

Joint session with the Ecological Society of America—L. C. Cole: 
Some examples of the ecologist’s problems; C. I. Bliss: Some over- 
dispersed distributions; P. J. Clark: Nearest neighbour methods in the 
analysis of spatial pattern. 

Session of contributed papers—C. White: Distribution of erythrocytes
on the floor of the hemocytometer chamber; F. E. Satterthwaite:
Random assignment experimental designs. 

Joint sessions with the American Society for Horticultural Science— 
R. L. Wine: On the comparison of multiple test procedures; C. W. Dunnett:
Multiple decision procedures. 

A joint meeting with the American Statistical Association was held 
in Detroit, Mich. from September 7-10. The programme included:— 

*Covariance Analysis—D. B. deLury: Elements of covariance; 
W. T. Federer: Covariance analysis with unequal subclass numbers; 
M. Zelen: The analysis of covariance for incomplete block designs; 
H. O. Hartley: Group comparisons and analysis of variance and covariance
in cluster sampling.

Animal Feeding Experiments—G. C. Ashton: Control of error
variance in swine feeding experiments. 

Screening problems in industrial biology—C. W. Dunnett: The 
Pharmaceutical Industry; M. B. Carroll: The Food Industry; H. Smith, 
Jr.: Biochemical Research.

*Applications of stochastic processes—D. W. Alling: The after- 
history of pulmonary tuberculosis; A. F. Bartholomay: The kinetics 
of enzyme action. 

Biological standardization—I. A. DeArmon: Control of precision 
in the plate count assay; B. W. Brown: More sensitive vaccine screening 
procedures; J. Ipsen: Assays of immunizing agents; M. C. Sheps: 
Between-assay error in biological assays. 

The sessions marked * were co-sponsored by the Institute of Mathe- 
matical Statistics. 


German Region 


A Biometric Colloquium at Hanover is planned for September 
10th, with the following programme—A. Lein: Anwendungsmöglichkeiten
der Biometrie in der Landwirtschaft; M. Wermke: Die Verwendung
von Lochkartenmaschinen zur Auswertung von Feldversuchen;
W. Seyffert: Versuchsauswertung nach Messung mehrerer Merkmale;
H. Rundfelt: Zur Rationalisierung der Pflanzenzüchtung.


WNAR 


The Annual Meeting was held at Seattle on August 22-24. Included 
in the programme was an invited address by Prof. N. Rashevsky, 
entitled “Models and General Mathematical Principles in Biology and
Sociology”.

Other papers given at the meeting included:—L. Moses: Some 
non-parametric techniques; W. Becker: The log transformation of 
growth data; D. Jenden: Reaction rates in geometrically constrained 
enzyme systems; H. Hotelling: New light on the multiple correlation 
coefficient; P. Horst: Optimal estimates of multiple criteria with restric- 
tions on the covariance matrix of estimated criteria; V. Miller: Procedural 
consideration in forecasting population; F. Johnson: The role of statistical
methods in forest and range research; T. Orr: Some notes on
the growth of partially cut stands; J. H. G. Smith and J. W. Ker: 
Some distributions encountered in sampling of forest stands; B. Wagg: 
The relationship between the growth of the Spruce budworm and 
certain environmental factors; M. Ahmed: A stochastic model for the 
tunneling and re-tunneling of flour beetles; E. R. Rich: A stochastic 
model for the number of beetles on the surface of flour. 


BIOMETRICS—DIRECT INCOME AND EXPENSE—1955 


(Final Report, March 29, 1956) 
Income
(1) Reserve (1951 thru 1954) sale 3,1                  849.00
    Reserve (1951 thru 1954) sale 7,1                  756.00
    Balance from 1954 books                           6387.34
                                                                  7992.34
(2) Member subscriptions
    ASA        443 at  4.00                           1772.00
    Bio. Soc.  573 at  4.00
               680 at  2.75
                 6 at 25.00                           4312.00
                                                                  6084.00
(3) Non-members subscriptions (690)                               4835.75
(4) Sale of back issues
    Vol. 3, No. 1   106 at 1.50                        159.50
    Vol. 7, No. 1    23 at 2.00                         46.00
    Others                                            2038.47
                                                                  2243.97
(5) Sale of reprints                                               998.67
(6) IUBS (Proceedings Brazil Symposium)                            500.00
                                                                 22654.73
Expense
(1) Overpayment and cancelation of subscriptions       [amount illegible]
(2) Joseph Ruzicka, bookbinding                        [amount illegible]
(3) Express charges                                                598.62
(4) Inst. of Stat. editorial management                           1000.00
    [item illegible]                                               230.69
    Post [illegible]                                               300.00
    [item illegible]                                               500.00


THE STATISTICAL ANALYSIS OF A COMPLEX EXPERIMENT
INVOLVING UNINTENTIONAL CONSTRAINTS 


D. J. FINNEY 
A.R.C. Unit of Statistics and Department of Statistics, 
University of Aberdeen, Scotland 
AND 
F. W. COPE
Imperial College of Tropical Agriculture, Trinidad 


1. INTRODUCTION 


The analysis of variance is now widely known, and so thoroughly 
explained in many text books that it can be used regularly by large 
numbers of research workers who have had no special training in sta- 
tistics. Nevertheless, various techniques useful in special circumstances 
are much less familiar, though perhaps obvious to a mathematical 
statistician. Experiments that do not go exactly according to plan, 
either because of flaws in design or because of unsuspected trends and 
sources of variation, often provide illustration of the value of several 
such techniques in a single analysis. 

This paper discusses the analysis of yields from a complex factorial 
experiment on cacao. Before planting of the experimental area had 
been completed, a flaw in the design was discovered. Advantage was 
taken of this discovery to modify both the treatments and the design; 
certain constraints of the original design had already been introduced, 
however, and in any final analysis of the experiment account ought to be 
taken of these. Even without this complication, the statistical analysis 
of the experiment presents interesting features, and here these will be 
illustrated with the aid of the yields of cacao in 1954-55. This paper is 
not intended as a definitive account of the interpretation of the experiment,
which, since cacao is an orchard crop, must continue for many
years before adequate evaluation of the experimental treatments is
possible.


Extent of computations 


The reader may perhaps be intimidated by the amount of heavy 
computation involved. The experiment is large and complicated, so 
that the statistical analysis is inevitably laborious. However, he should
remember that such extensive computation is not always necessary in
a complex experiment. But for the flaw in the original design, only the 
calculations of Sections 3 and 4 would have been required; moreover, as 
shown in Section 4, the rather tedious evaluation of many different 
standard errors can be replaced by a simple and rapidly obtained 
approximation that is good enough for most purposes. Although for 
completeness the formation of summary tables has been fully described, 
in practice not all of these might be wanted. Once the method of 
analysis has been outlined, it can be applied to yields from successive 
years or sequences of years without difficulty, since the formulae for 
the various quantities remain unaltered and only numerical values 
have to be changed. Two or three days of computation annually, to 
a routine pattern, does not seem extravagant for an experiment of this 
size. 

Section 5 contains the most laborious calculations, but these are 
the price that must be paid for the non-orthogonality of design. An 
experienced statistician would perhaps have guessed from inspection 
of the yields and a few trial calculations that, for the 1954-55 crop, 
the complications of Section 5 could be ignored without serious harm, 
although it is difficult to be sure of this without the full analysis as 
given. The greater part of the work of Section 5, the inversion of a 
matrix, has to be done only once and is then available (Table 8) for 
any set of yields from this experiment that may require analysis in the 
future. The calculations described in Section 5 took one of us many 
hours, but this phase in the analysis of yields from another year of the 
experiment might not need more than one hour. 


The experiment 


In 1949, it was agreed that the Soil Science and Chemistry Section 
of the Cocoa Research Scheme (which is administered by the Imperial 
College of Tropical Agriculture) should lay down an experiment at 
River Estate, Trinidad, in order to investigate the effects of trenching 
(the burying of waste vegetation in shallow trenches) on the growth 
and bearing of cacao trees, in comparison with surface applications of 
the same materials or of bagasse (cellulosic waste from sugar mills). 
The underlying idea was to attempt to find the most expedient method 
of reinstating the normal organic profile in cacao soils; considerable 
evidence is adducible that the gradual decline in productivity of Trinidad 
cacao-producing soils is largely due to the loss of organic matter, coupled 
with a deterioration in the biological condition of the soil consequent
upon diminution in the supply of leaf litter (Havord [1951]; Havord
et al. [1953, 1954]).

Study of any interaction between spacing of trees and soil amelioration
by the trenching and mulching treatments seemed desirable.
Furthermore, in order to increase the generality of any recommendations 
arising from the experiment, the inclusion of four clones of cacao (propa- 
gated as cuttings) was agreed. 


2. THE DESIGN 
The original plaid square 


When the experiment was first designed (by one of us (F.W.C.) 
and Dr. K. S. Dodds), the Soil Science and Chemistry Section was
thought to intend including the trenching comparison on a factorial
basis, in combination with the two mulching materials, bagasse and
cut bush. Hence the designers contemplated a 2³ X 4 factorial design
on the factors:


S: Spacing (S1 = 7.2 ft square, S2 = 9 ft square);
M: Mulch (M1 = bagasse, M2 = cut bush);
T: Trenching (T1 = untrenched, T2 = trenched);
V: Clone (V1, V2, V3, V4).


It was then regarded as essential that the mulched plots should run in 
strips, in order to facilitate applications of mulches several times a year. 


The spacing treatments were also to be arranged in strips, so that border
effects would be minimized. An 8 X 8 plaid square design was therefore
proposed, in which M and certain interactions would be confounded 
with columns, S and other interactions confounded with rows. 

The design (Fig. 1) was constructed by taking a plan appropriate 
to an 8 X 8 plaid square, assigning four randomly selected columns to
M1 and the others to M2, also assigning four randomly selected rows
to S1 and the others to S2, and then randomizing the arrangement.
Planting was begun in 1949 and completed in 1950. During the second 
season’s planting, the non-orthogonality of the field plan was discovered; 
in particular, it contains three plots of each of the combinations 
S,M,T,V; , SoM.T,V; , S.M,T.V. , SiM.T.V, and only one each of 
S,M,T.V,, 8:M,T,V; , S.M.T,V; , S:M.T.V3; . How the basic plan 
evolved into a non-orthogonal design remains to this day a mystery. 


The final partially confounded design 


Up to this time, mulching and trenching had not begun, so that only 
factors S and V had been introduced. One possibility then was to 
transplant a sufficient number of plots so as to produce a correct design 
by interchange of clones. Though this was physically possible, the 
authors had little hesitation in advising against a procedure that would 
have introduced an unpredictable and possibly heavy mortality on a 
few plots; replacement of dead plants would have produced plots of 
mixed ages, since the original cuttings were then in their second year, 
and serious biases in eventual comparisons between treatments might 
have resulted. 

The alternative adopted was to ‘degrade’ the plaid square into a 
series of eight randomized blocks formed by the rows, with S confounded 
between blocks. If the requirement that strips of eight plots should 
have the same mulching treatment were abandoned, the remaining 
treatments could then be introduced so as to confound unimportant 
interactions. By chance, the top and bottom halves of the experiment 
contained equal numbers of S1 and S2 rows and so could be regarded as
replicates. A relic of the original design would remain, in the constraint 
that each clone would appear twice in each column, and special steps 
might later be needed in order to take account of this. 

At this stage, the Soil Science and Chemistry Section stated a
preference for a non-factorial set of soil treatments instead of M and T. 
The practical difficulties of digging a trench large enough to bury the 
amounts of mulch material applied annually to the surface suggested 
replacement of this form of trenching by a practice previously found 
beneficial on this estate. This involved opening shallow trenches only 
every fourth year, burying any organic materials at hand, and covering.


Block  Spacing   Plot yields, in order across the block                Block total
I      S2         425    551    611   1035   1444    785    663    573     6,087
II     S1         904  [?]92  [?]49    527    821    447    545   1720     6,705
III    S2         663    642    846    623    694    976   1149   1095     6,688
IV     S1        1087   1333   1061    904    810    606   1267   1976     9,044
V      S1         717   1198    653   1154   1108   1283    776   1144     8,033
VI     S1         761   1482    795   1494   1391   1027    668   1071     8,689
VII    S2         462   1134   1244    553   1004    960   1697   1209     8,263
VIII   S2        1237    534    439   1734   1897    294   1532    453     8,120

(The first yield of block I is restored from the block total. The leading
digits of the second and third yields of block II, and the clone and mulch
symbols printed above each yield, are illegible in this copy.)

FIGURE 2 


Revised (1950) design and yields of dry cacao in 1954-55, in lb. per acre. For 
explanation of symbols, see text. 


The arrangement finally adopted was that shown in Fig. 2; it has a 
2 X 4² set of factors, where the symbols indicate


S: Spacing (S1 = 7.2 ft square, S2 = 9 ft square)

M: Mulch (M1 = bagasse mulch, annually;
M2 = cut bush mulch, annually;
M3 = burial of organic matter in trenches, every fourth year;
M4 = no treatment)

V: Clone (V1, V2, V3, V4 = cuttings of ICS 1, 6, 8, 60 respectively).


As far as was practicable, chemical equivalence of M1, M2, M3 was
aimed at, and the organic plant materials of M3 are therefore
fortified with pen manure.

The experiment now consists of two replicates of the 32 treatment
combinations, in eight blocks of eight plots. The effect of spacing and 
certain components of the VM and SVM interactions are confounded. 
The scheme of confounding can best be explained after a formal replace- 
ment of V and M by pairs of quasi-factors each at two levels; thus the 
four clones are made to correspond to the four combinations of two 
factors A and B and the four types of mulching to the four combinations
of two factors C and D. The experiment can now be regarded as a 2⁵ in
two replicates, where the correspondence of main effects is


S = S1 - S2 (for convenience, since S1 shows the higher yields),

A = -V1 + V2 - V3 + V4,
B = -V1 - V2 + V3 + V4,
AB = V1 - V2 - V3 + V4,
C = -M1 + M2 - M3 + M4,
D = -M1 - M2 + M3 + M4,
CD = M1 - M2 - M3 + M4.


If the treatments on the 64 plots are rewritten in terms of the factors 
S, A, B, C, D, it becomes apparent that S, BD, and SBD are con- 
founded in the first replicate and S, ABCD, and SABCD are confounded 
in the second. In fact, five mutually orthogonal block contrasts can 
be identified as follows: 


The contrast of (II + IV + V + VI) with (I + III + VII + VIII) estimates S, 
the contrast of (I + IV) with (II + III) estimates ABCD, 

the contrast of (III + IV) with (I + II) estimates SABCD,

the contrast of (VI + VII) with (V + VIII) estimates BD, 

the contrast of (VI + VIII) with (V + VII) estimates SBD. 


There are, of course, two further contrasts between blocks mutually 
orthogonal and orthogonal with these five, so completing the set of 7 
d.f. These can be identified with the difference between replicates 
and the interaction of S with replicates. 
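
The five block contrasts listed above can be checked numerically from the block totals printed beside Fig. 2. The Python sketch below is only an illustration; the dictionary and function names are hypothetical. It evaluates each contrast and confirms that the five coefficient vectors are mutually orthogonal. The values it gives for S (3313), ABCD (1738) and BD (799) agree with the figures used in §3.

```python
from itertools import combinations

BLOCKS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]
# Block totals (lb. per acre) as printed beside Fig. 2.
BLOCK_TOTALS = dict(zip(BLOCKS, [6087, 6705, 6688, 9044, 8033, 8689, 8263, 8120]))

# Each contrast: (blocks taken with +1, blocks taken with -1).
CONTRASTS = {
    "S": (["II", "IV", "V", "VI"], ["I", "III", "VII", "VIII"]),
    "ABCD": (["I", "IV"], ["II", "III"]),
    "SABCD": (["III", "IV"], ["I", "II"]),
    "BD": (["VI", "VII"], ["V", "VIII"]),
    "SBD": (["VI", "VIII"], ["V", "VII"]),
}


def coefficients(plus, minus):
    """Coefficient vector (+1, -1 or 0) over the eight blocks."""
    return [1 if b in plus else -1 if b in minus else 0 for b in BLOCKS]


def contrast_value(name):
    """Value of the named contrast computed from the block totals."""
    plus, minus = CONTRASTS[name]
    coeffs = coefficients(plus, minus)
    return sum(c * BLOCK_TOTALS[b] for c, b in zip(coeffs, BLOCKS))


def mutually_orthogonal():
    """True if every pair of contrast vectors has zero inner product."""
    vectors = [coefficients(*CONTRASTS[name]) for name in CONTRASTS]
    return all(sum(x * y for x, y in zip(u, v)) == 0
               for u, v in combinations(vectors, 2))
```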


Plot size 


The gross plot size is 36 ft X 36 ft. Each plot of S1 carries 16
experimental trees, and each plot of S2 carries 9. There are guard
rows (ICS 1) between plots, those within blocks having trees at the
same spacing as the experimental trees of the blocks and those between
blocks having a spacing that conforms to the block above it in Fig. 2.
Thus the net plot sizes, for experimental trees only, are 28.8 ft X 28.8
ft for S1, 27 ft X 27 ft for S2.


3. ANALYSIS OF VARIANCE 


Orthogonal contrasts 


Fig. 2 also shows yields per plot for 1954-55, expressed as lb. of dry
cacao per acre, of course based upon calculations from net plot sizes.
Naturally at this early date no adequate analysis and interpretation of
yields is possible, but a consideration of the technique of analysis is of
interest as an illustration of statistical methodology. In the first
instance, the column constraints residual from the earlier design will
be ignored and the experiment regarded simply as a 2 X 4² factorial
with partial confounding; further consideration of these constraints
is deferred to §5. The first stage in the analysis is the formation
of the analysis of variance. This is most readily achieved by use
of the formal representation of the design as a 2⁵ with the factors
S, A, B, C, D, followed by re-interpretation in terms of the true factors.


TABLE 1
CALCULATION OF ORTHOGONAL CONTRASTS
(only the opening rows survive in this copy; ... marks an illegible entry)

Treatment   Sum of        Total     Diff. of      Total    Contrast
symbol      replicates              replicates
(ii)        (iii)         (iv)      (v)           (vi)
(1)            978        61629       128          ...
s             1699          ...       ...          ...
a             1828          ...       440          ...
sa            1920          ...       222          ...
b             1629          ...      -561          ...
sb            1100          ...       206          ...
ab            1810          ...       664          ...


Table 1 summarizes the calculation of orthogonal contrasts between 
the 64 yields for deriving main effects, interactions, and components of 
error by Yates’s method of systematic additions and subtractions 
(Yates [1937]). Columns (iii) and (iv) of this table are obtained by 
adding, from the two replicates, yields of pairs of plots with the same 
treatment and then performing the five successive series of additions 
and subtractions required by the Yates procedure; only the first and 
final columns of this calculation are shown. In columns (v) and (vi),
differences between pairs of similarly treated plots are analysed in the 
same way. Thus the two ‘totals’ columns contain the treatment 
contrasts and the interactions of these with replicates. Up to this 
stage, no account has been taken of the confounding. 
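
For readers unfamiliar with the procedure, here is a minimal Python sketch of Yates's scheme of successive additions and subtractions for a 2^k factorial. The function name and the four treatment totals in the usage note are hypothetical; the sketch illustrates the technique and does not reproduce Table 1.

```python
def yates(totals):
    """Apply Yates's additions and subtractions to treatment totals.

    `totals` must be in standard order, e.g. (1), a, b, ab for a 2 x 2.
    The result holds the grand total followed by the factorial contrasts
    in the same standard order (A, B, AB, ...).
    """
    n = len(totals)
    if n == 0 or n & (n - 1):
        raise ValueError("number of totals must be a power of two")
    column = list(totals)
    for _ in range(n.bit_length() - 1):  # one pass per factor
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs  # sums in the top half, differences below
    return column
```

With hypothetical totals (1) = 10, a = 14, b = 12, ab = 20, two passes give the grand total 56 and the contrasts A = 12, B = 8, AB = 4, which can be confirmed by direct addition and subtraction of the four totals.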


Sums of squares 


The analysis of variance, Table 2, can now be completed, the only 
complication being that of the partially confounded contrasts. For 
example, the component for replications is the square for ‘sum’ in column 
(vi) of Table 1: 


4581² ÷ 64 = 327,899.


So the sum of squares for clones is formed from A, B, and AB in column
(iv) as


(9691² + 3365² + 85²) ÷ 64 = 1,644,468;


the unconfounded 7 d.f. for the VM interaction have a sum of squares
formed similarly from AC, BC, ABC, AD, ABD, ACD, BCD.

The totally confounded contrast for spacing requires no modification
from the value in Table 1, and the square,


3313² ÷ 64 = 171,500,


appears in the inter-block section of Table 2. The partially confounded
contrasts require special calculation, since each has an intra-block
portion calculated from one replicate and an inter-block portion from
the other. These could be formed [...]
and the inter-block is

[3159 - (-317)] ÷ 2 = 1738.

Hence the intra-block sum of squares for BD and ABCD is

(920² + 1421²) ÷ 32 = 89,551,

and the inter-block sum of squares is

(799² + 1738²) ÷ 32 = 114,345.


TABLE 2
ANALYSIS OF VARIANCE

Variation due to                   D.F.   Sum of squares   Mean square
Replications                         1         327,899
S                                    1         171,500
VM                                   2         114,345
SVM                                  2         278,337
Error (inter-block)                  1         108,488
Inter-block                          7       1,000,569
V                                    3       1,644,468        548,156
SV                                   3         358,191        119,397
M                                    3       2,368,114        789,371
SM                                   3         200,790         66,930
VM (unconfounded)                    7      [illegible]        81,398
   (intra-block BD and ABCD)         2          89,551
SVM (unconfounded)                   7      [illegible]        54,802
   (intra-block SBD and SABCD)       2          25,821
Error (intra-block)                 26       2,928,948        112,652
Total                               63       9,726,885



The sum of squares for error can be computed from the totals in 
column (vi) of Table 1, by summation of the squares for all the con- 
trasts except those for replicates, S X replicates, and the four involved 
in the partial confounding. The sum of the five inter-block components 
in Table 2 can then be checked against a sum of squares formed directly 
from the eight block totals, and the sum of squares for the whole table 
can be checked against a total computed from the individual plot yields. 
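
The arithmetic of this section is easy to verify. The Python sketch below, with hypothetical names, recomputes the sums of squares quoted above from the contrast totals. Two steps are inferences rather than statements from the text: the intra-block ABCD value is taken as the half-sum of 3159 and -317, the counterpart of the half-difference given for the inter-block value, and the inter-block BD value is taken as the Table 1 total (quoted in §4 as 1719) less the intra-block value 920.

```python
def sum_of_squares(contrast_totals, divisor):
    """Sum of squared contrast totals divided by the appropriate divisor."""
    return sum(c * c for c in contrast_totals) / divisor


# Contrasts based on all 64 plots are divided by 64.
replications_ss = sum_of_squares([4581], 64)
spacing_ss = sum_of_squares([3313], 64)
clones_ss = sum_of_squares([9691, 3365, 85], 64)  # A, B, AB

# Partially confounded contrasts: each part rests on one replicate (32 plots).
abcd_intra = (3159 + (-317)) / 2  # inferred: half-sum of columns (iv) and (vi)
abcd_inter = (3159 - (-317)) / 2  # as given in the text
bd_intra = 920                    # value from the unconfounded replicate
bd_inter = 1719 - bd_intra        # inferred: Table 1 total less intra-block part

vm_intra_ss = sum_of_squares([bd_intra, abcd_intra], 32)
vm_inter_ss = sum_of_squares([bd_inter, abcd_inter], 32)

# The five inter-block components of Table 2 should add to the inter-block line.
inter_block_total = 327899 + 171500 + 114345 + 278337 + 108488
```

Rounded to the nearest unit, these reproduce 327,899, 171,500, 1,644,468, 89,551 and 114,345, and the five inter-block components add to 1,000,569.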


Tests of significance 


For preliminary tests of significance, mean squares for main effects 
and interactions in Table 2 can be compared with appropriate error 
mean squares. The intra-block portion of BD and ABCD of course 
involves the same error as the remaining 7 degrees of freedom for this 
interaction, and only a single mean square need be shown for VM. In 
the intra-block section of the analysis, the mean squares for clones and 
for mulches are significantly greater than the error mean square. None 
of the interaction mean squares is statistically significant by comparison 
with the error, but, since some of them are based on several degrees 
of freedom, they will be worth closer examination when tables of means 
are drawn up (§4). The inter-block section of the analysis is chiefly 
of interest for the information it might give on spacing. No test of 
significance is much use, since only 1 degree of freedom for the true 
error is available and even pooling interactions with this adds only 4 
degrees of freedom. However, the mean square for spacing is not much 
larger than either the inter-block error or a mean square formed by 
inclusion of the interactions; these two are almost the same as the 
intra-block error, so suggesting that there is very little additional error 
between blocks. Although no useful standard error for the difference 
between spacings can be quoted, any such difference is evidently too 
small to have been proved significant by an experiment of this size 
even if the contrast were unconfounded, despite the fact (noted below)
that the difference is estimated by the experiment as about ten per cent
of the mean yield.


4. SUMMARY TABLES 


Rules for construction 


In any factorial experiment, it is usually desirable so to tabulate 
mean yields as to show all two-factor interactions, even when these are 
not statistically significant. Even though an interaction is not large 
enough for a confident assertion that it differs from zero to be made on 
the evidence of the one experiment, the values may be wanted for 
comparison with results from other experiments. 

For the interactions of spacing with clones and of spacing with 
mulches, no problems of confounding arise, and mean yields can be 
computed by direct averaging of individual plot values. An alternative 
process that is rather quicker in a large experiment involves further 
applications of Yates’s scheme of additions and subtractions. The 
rule for obtaining mean yields for all combinations of any set of factors 
is as follows:— 


(i) Take from column (iv) of Table 1 the values for the sum and all main 
effects and interactions of the factors to be studied; write these in a column 
in the reverse of Yates’s standard order. 

(ii) Perform additions and subtractions on these according to Yates’s scheme, 
starting from the top of the column in the ordinary manner. 

(iii) Divide each entry in the final column by 64 (more generally, by the number 
of plots in the experiment). 

(iv) Read the results as the mean yields for the combinations of the factors in 
the order in which these combinations are introduced in Table 1 but reading
from the foot of the column. 


Thus, in order to obtain the means for the four clones, the means for 
all combinations of factors A, B are required. The grand total and the 
effects A, B, AB are written in a column in the reverse of the standard 
order and the Yates calculations completed:— 


Contrast    Total      (i)     (ii)    ÷64   Mean
AB            -85     3280    74600   1166   ab, or V4
B            3365    71320    55388    865   b, or V3
A            9691     3450    68040   1063   a, or V2
Sum         61629    51938    48488    758   (1), or V1
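
The four-step rule and the clone example can be sketched in Python as follows. The function names are hypothetical; the input is the column of contrasts written in the reverse of the standard order, exactly as in the small table above.

```python
def yates_pass(column):
    """One pass of Yates's scheme: pairwise sums, then pairwise differences."""
    sums = [column[i] + column[i + 1] for i in range(0, len(column), 2)]
    diffs = [column[i + 1] - column[i] for i in range(0, len(column), 2)]
    return sums + diffs


def means_from_contrasts(contrasts_reversed, n_plots):
    """Recover mean yields from contrasts listed in reverse standard order.

    The result is read in the same reversed order: for two factors the
    entries correspond to ab, b, a, (1).
    """
    n = len(contrasts_reversed)
    if n == 0 or n & (n - 1):
        raise ValueError("number of contrasts must be a power of two")
    column = list(contrasts_reversed)
    for _ in range(n.bit_length() - 1):  # one pass per factor
        column = yates_pass(column)
    return [value / n_plots for value in column]


# AB, B, A and the grand total from column (iv) of Table 1; 64 plots in all.
clone_means = means_from_contrasts([-85, 3365, 9691, 61629], 64)
```

Rounded to whole numbers, `clone_means` gives 1166, 865, 1063 and 758 for V4, V3, V2 and V1 respectively, matching the table.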


Yates (1937) described this process also, but it appears to be less widely © 
known than his process for forming the contrasts. Division by 64 
then gives the mean yields for V, , V2, V3, Vs. In this way, the 
entries in Tables 3 and 4 have been constructed. The standard errors 
for these tables are based entirely on the intra-block error mean square, 
and are applicable to comparisons of the main effects of V and M 
(16-fold replication) or to the interactions (8-fold replication). As 
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has already been seen, no useful estimate of inter-block error is available, 
so that comparisons between wide and narrow spacing cannot be 
referred to any standard error; there is some indication of an appreciable 
(ten per cent) improvement in yield at the narrower spacing, but little 


TABLE 3 


Mean YIELDS FOR CLONES AND SPACING 
(lb. dry cacao per acre) 


ICS clone Mean 
Spacing 1 6 8 60 
GY 869 1126 792 1272 1015 
Q’ 646 1000 939 1059 911 
Mean (+84) 758 1063 865 1166 963 


S.E. for interactions: 119 lb. per acre 
There is no estimate of S.E. for vertical differences. 


TABLE 4 


MEAN YIELDS FOR MULCHES AND SPACING 
(lb. dry cacao per acre) 


                               Mulch
Spacing        Bagasse   Cut bush   Trenching    None     Mean
Narrow             892       1257        1024     886     1015
Wide (9')          781       1236        1026     603      911
Mean (±84)         837       1246        1025     744      963


S.E. for interactions: ±119 lb. per acre 
There is no estimate of S.E. for vertical differences. 


confidence can be placed in this. Evidently clones V2 and V4 (ICS 6 
and ICS 60) have yielded substantially better than the other two, 
differences within either pair being less noticeable. Mulching with cut 
bush was the most successful of the four treatments, its superiority 
to trenched material just failing to reach statistical significance (cf. §5); 
mulching with bagasse, on the other hand, was not significantly better 
than no treatment. The sums of squares for the two interactions with 
spacing were not sufficiently large to give statistical significance even 


if either of these were concentrated in a single degree of freedom, so that 
not surprisingly no striking interactions are displayed in Tables 2 and 3. 
There is some indication that V3 (ICS 8) may be relatively more successful 
at wide spacing than are the other clones. 


A partially confounded interaction 


Perhaps the most interesting interaction is the VM. In order to 
prepare a table to display this, each partially confounded component 
of interaction must be adjusted to a value based on the replicate in 
which it is unconfounded. The calculations (Table 5) may be per- 
formed as described above. However, the value for the BD contrast 
in Table 1, namely 1719, must be replaced by the value based upon the 
second replicate alone; this was shown in §3 to be 920, which figure 
must be doubled for insertion in Table 5 so as to be comparable with 


TABLE 5 
CALCULATION OF VM TABLE ADJUSTED FOR PARTIAL CONFOUNDING 

Contrast and
magnitude            (i)      (ii)     (iii)     (iv)    ÷64   Treatment
ABCD   [2842]       4827    -10653    -15068    60636    947   V4M4
BCD      1985     -15480     -4415     75704    51934    811   V3M4
ACD     -4433       3409      1104     -9466    35060    548   V2M4
CD     -11047      -7824     74600     61400    42786    669   V1M4
ABD      1569       1272     -7471    -31540    79734   1246   V4M3
BD     [1840]       -168     -1995     66600    54852    857   V3M3
AD      -2799       3280      6012     -8294    78554   1227   V2M3
D       -5025      71320     55388     51080    49116    767   V1M3
ABC      -219       -857    -20307      6238    90772   1418   V4M2
BC       1491      -6614    -11233     73496    70866   1107   V3M2
AC      -2235        271     -1440      5476    98140   1533   V2M2
C        2067      -2266     68040     49376    59374    928   V1M2
AB        -85       1710     -5757      9074    67258   1051   V4M1
B        3365       4302     -2537     69480    43900    686   V3M1
A        9691       3450      2592      3220    60406    944   V2M1
Sum     61629      51938     48488     45896    42676    667   V1M1


other contrasts based on 64 plots. The value for ABCD is similarly 
modified. The additions and subtractions then automatically lead 
to the adjusted mean yields in the last column of Table 5. This process 
is here seen to particular advantage, since it removes all need for the 
worry about the signs of adjustments to means that occurs in the more 
usual process of direct calculation of averages from yields of individual 
plots. 


The average variance 
In Table 6, these means have been rearranged in an interaction 


table of conventional pattern. At this point, a direct check that the 
interaction components BD and ABCD have the values 58 and 89 


TABLE 6 


MEAN YIELDS FOR CLONES AND MULCHES, ADJUSTED FOR CONFOUNDING 
(lb. dry cacao per acre) 


                        ICS clone
Mulch              1       6       8      60    Mean (±84)
Bagasse          667     944     686    1051        837
Cut bush         928    1533    1107    1418       1246
Trenching        767    1227     857    1246       1024
None             669     548     811     947        744
Mean (±84)       758    1063     865    1166        963


Average S.E. for body of table: ±186 lb. per acre 


respectively, as computed on a single plot basis from the totals in §3, 
is useful. The standard errors of the margins of this table are as in 
Tables 3 and 4. For components of interaction, the standard error 
depends upon what combination of unconfounded and partially con- 
founded contrasts is involved. A rough idea of precision can be ob- 
tained by assigning to entries in the body of the table an average 
standard error. This is obtained by noting that the interaction has 9 
degrees of freedom, of which 7 are unconfounded and 2 are unconfounded 
only in one replicate; hence an average variance is 


\[ [7 \times (s^2/4) + 2 \times (s^2/2)] \div 9 = 34421 = (186)^2, \]


where s^2 is the intra-block error mean square from Table 2. 
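The averaging above can be checked numerically. The sketch below assumes that the intra-block error mean square is the sum of squares 2,928,948 on 26 degrees of freedom (the total quoted later in Table 9), and that an entry in the body of Table 6 is a mean of 4 plots when unconfounded and of 2 plots when taken from one replicate only. Both readings are inferences from the text, not statements by the author.

```python
import math

# Intra-block error mean square (assumed: 2,928,948 on 26 d.f.)
s2 = 2_928_948 / 26

unconfounded = 7 * (s2 / 4)   # 7 d.f. estimated from both replicates
confounded = 2 * (s2 / 2)     # 2 d.f. estimated from one replicate only
average_variance = (unconfounded + confounded) / 9

print(round(average_variance))             # 34421
print(round(math.sqrt(average_variance)))  # 186
```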

This average standard error can be used in a rough test of any 
component of interaction. For example, inspection of Table 6 suggests 
that one feature of the difference between the two better clones is that 
V2 responds well to bagasse whereas V4 does not. In the absence of other 
reasons for subjecting this point to special examination, to single it out 
for separate test might introduce a bias into the test of significance. 
That is not to say that to make such an examination is wrong, for a 




tentative inquiry into a contrast suggested solely by the internal 
evidence of an experiment may be the beginning of discoveries about 
previously unsuspected effects. Here this contrast is chosen only for 
illustrative purposes, and with the warning that any apparent statistical 
significance disclosed would need to be accepted with reservations on 
account of its arbitrary selection as the most notable feature of Table 6. 

Define now the symbol V_iM_j to represent the mean yield of clone i 
on treatment j, one of the quantities tabulated in the body of Table 6. 
The comparison that is of interest, denoted by H, is 


\[ H = \tfrac{1}{2}(V_2M_1 - V_2M_4 - V_4M_1 + V_4M_4) = 146. \]


From the average variance of the mean yields, an approximate variance 
obtained immediately is 


\[ V(H) = (\tfrac{1}{2})^2 \times 4 \times 34421 = (186)^2. \]


Since the standard error is even larger than H, there is clearly no 
suggestion of statistical significance for the contrast. 


The exact variance 


Had the issue of this test been in any doubt, reliance on an average 
variance might have been undesirable. Calculation of an unbiased 
estimate of variance appropriate to H requires that H be expressed as 
a linear function of the nine interaction contrasts AC, BC, ABC, ..., 
ABCD used in the earlier analysis. As will be seen below, the difference 
between the two variances is negligible, and for all practical purposes 
the average variance is good enough. Details of the exact calculation 
are presented as an example of a process that is tedious though not 


inherently difficult. 
When expressed in units of a single plot, the interaction contrast 
BC can be written formally as 

\[ BC = \tfrac{1}{8}(-V_1 - V_2 + V_3 + V_4)(-M_1 + M_2 - M_3 + M_4), \]

where the contents of the brackets have been taken from the definitions 
of B and C in §2 and the symbols on the right are meaningful only 
after the product has been evaluated according to ordinary algebraic 
rules. The numerical value of this contrast can be taken from column 
(iv) of Table 1, now with a divisor of 32 to reduce the quantity to units 
of a single plot difference, and is therefore 1491 ÷ 32. Similar expressions 
can be written for the other eight contrasts, but of course for BD 
and ABCD the numerical values must be taken from one replicate only, 
so as to use only the intra-block estimate as in §2. 
The unique representation of H as a sum of multiples of these nine 
contrasts can now be discovered by a process analogous to the computation 
of regression coefficients. The justification lies in the mutual 
orthogonality of the nine. The multiplier of any contrast is the sum 
of products of the coefficients of V_iM_j in H and in the contrast, divided 
by the sum of squares of these coefficients in the contrast. Since the 
formula for H given above involves only four of the V_iM_j, only these 
need special attention in the contrast. For example, the formal equation 
for BC can be re-written, by expansion of its right hand side, as 

\[ BC = \tfrac{1}{8}(V_2M_1 - V_2M_4 - V_4M_1 + V_4M_4 + \cdots \text{ etc.}). \]

The rule then gives for the multiplier of BC 

\[ [(\tfrac12)(\tfrac18) + (-\tfrac12)(-\tfrac18) + (-\tfrac12)(-\tfrac18) + (\tfrac12)(\tfrac18)] \div [(\tfrac18)^2 \times 16] = 1, \]


since the full expansion of BC has 16 terms. Similarly for AC the 
multiplier is 


\[ [(\tfrac12)(-\tfrac18) + (-\tfrac12)(\tfrac18) + (-\tfrac12)(-\tfrac18) + (\tfrac12)(\tfrac18)] \div [(\tfrac18)^2 \times 16] = 0. \]

Completion of these calculations* leads to 

\[ H = BC + ABC + BD + ABD. \]


As a numerical check, values of the contrasts may be inserted; reading 
from Table 1, or here more conveniently from Table 5, 


\[ H = (1491 - 219 + 1840 + 1569)/32 = 146 \]


as before. 


Now AC, on a single plot basis, is a difference between two means 
of 32 plots, and therefore has variance s’/16. Six of the other inter- 
action contrasts have the same variance, but BD and ABCD have 
only half this precision. Hence the variance of H is 


\[ V(H) = \tfrac{1}{16}s^2 + \tfrac{1}{16}s^2 + \tfrac{1}{8}s^2 + \tfrac{1}{16}s^2 = 5s^2/16 = 35204 = (188)^2, \]


almost exactly the same as from the average variance. 


*The arithmetic can be completed almost more rapidly than it can be described. There are, of 
course, many other ways of obtaining the expression for H, and in this rather simple example doubtless 
a quicker way can be found. The rule proposed has the merit of generality and can be applied 
automatically. 
A second example 


As a slightly more complicated example, consider the comparison 
of the difference between the average treatment responses for V2 and V4. 
In the same notation, and on a single plot basis, this may be written 

\[ H = \tfrac{1}{6}(V_2M_1 + V_2M_2 + V_2M_3 - 3V_2M_4 - V_4M_1 - V_4M_2 - V_4M_3 + 3V_4M_4) = 198, \]

with approximate variance 

\[ V(H) = (\tfrac{1}{6})^2 \times 24 \times 34421 = (151)^2. \]

The multiplier of BC in H is, by the rule, 

\[ [(\tfrac16)(\tfrac18) + (\tfrac16)(-\tfrac18) + (\tfrac16)(\tfrac18) + (-\tfrac12)(-\tfrac18) + (-\tfrac16)(-\tfrac18) + (-\tfrac16)(\tfrac18) + (-\tfrac16)(-\tfrac18) + (\tfrac12)(\tfrac18)] \div [(\tfrac18)^2 \times 16] = \tfrac{2}{3}. \]

This method leads to 

\[ H = \tfrac{2}{3}(BC + ABC + BD + ABD + BCD + ABCD) = 198 \text{ as before.} \]

The unbiased estimate of variance is therefore 

\[ V(H) = (\tfrac{2}{3})^2 (4 \times \tfrac{1}{16}s^2 + 2 \times \tfrac{1}{8}s^2) = 2s^2/9 = (158)^2. \]


The difference from the approximate value is rather greater than before, 
but the two agree in showing H to be only about 1.3 times its standard 
error. 
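Both worked comparisons can be reproduced by applying the projection rule mechanically. The sketch below is illustrative only and rests on several assumptions. The clones V1..V4 are taken to be the combinations (1), a, b, ab, and the mulches M1..M4 to be (1), c, d, cd. Each single-plot contrast carries the coefficient 1/8 on every V_iM_j. The contrast totals are those of Table 5, and s^2 is taken as 2,928,948/26. The names `contrast`, `multipliers` and `evaluate` are hypothetical.

```python
import math
from fractions import Fraction as F

# Signs of the effects on clones V1..V4 = (1), a, b, ab (assumed labelling)
CLONE = {"A": (-1, 1, -1, 1), "B": (-1, -1, 1, 1), "AB": (1, -1, -1, 1)}
# Signs of the effects on mulches M1..M4 = (1), c, d, cd (assumed labelling)
MULCH = {"C": (-1, 1, -1, 1), "D": (-1, -1, 1, 1), "CD": (1, -1, -1, 1)}

# Contrast totals from Table 5; BD and ABCD are the adjusted values
TOTAL = {"AC": -2235, "BC": 1491, "ABC": -219, "AD": -2799, "BD": 1840,
         "ABD": 1569, "ACD": -4433, "BCD": 1985, "ABCD": 2842}
CONFOUNDED = {"BD", "ABCD"}      # estimated from one replicate only
S2 = 2_928_948 / 26              # intra-block error mean square (assumed)


def contrast(v, m):
    """Single-plot coefficients of the 16 means V_iM_j in contrast v x m."""
    return {(i, j): F(CLONE[v][i] * MULCH[m][j], 8)
            for i in range(4) for j in range(4)}


def multipliers(h):
    """Multiplier of each interaction contrast in the comparison h."""
    result = {}
    for v in CLONE:
        for m in MULCH:
            c = contrast(v, m)
            numerator = sum(h.get(cell, 0) * c[cell] for cell in c)
            denominator = sum(x * x for x in c.values())
            result[v + m] = numerator / denominator
    return result


def evaluate(h):
    """Return (value of H, exact variance of H) on a single-plot basis."""
    mult = multipliers(h)
    value = sum(mult[k] * F(TOTAL[k], 32) for k in mult)
    variance = sum(float(mult[k]) ** 2 * (S2 / 8 if k in CONFOUNDED else S2 / 16)
                   for k in mult)
    return float(value), variance


# First comparison: (V2M1 - V2M4 - V4M1 + V4M4) / 2; indices are zero-based
h1 = {(1, 0): F(1, 2), (1, 3): F(-1, 2), (3, 0): F(-1, 2), (3, 3): F(1, 2)}
# Second comparison: difference of average treatment responses, divisor 6
h2 = {(1, 0): F(1, 6), (1, 1): F(1, 6), (1, 2): F(1, 6), (1, 3): F(-1, 2),
      (3, 0): F(-1, 6), (3, 1): F(-1, 6), (3, 2): F(-1, 6), (3, 3): F(1, 2)}

for h in (h1, h2):
    value, variance = evaluate(h)
    print(round(value), round(math.sqrt(variance)))
```

Under these assumptions the first comparison has multiplier 1 on BC, ABC, BD and ABD, and the second has multiplier 2/3 on those four together with BCD and ABCD. Every other multiplier is zero.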


General conclusions 


Although the sum of squares for VM in Table 2 is large enough 
to contain one component significantly greater than the intra-block 
error, Table 6 does not suggest that any interaction has occurred. So 
far as the 1954-55 yields are concerned, the effects of mulching treat- 
ments show no strong evidence of varying from clone to clone. Table 2 
gives no reason to suppose that the three-factor interaction, SVM, is 
worth more detailed examination, but it could be studied in a similar 


manner if required. 
5. THE COLUMN CONSTRAINTS 


Non-orthogonality of columns 


The analysis so far has ignored the history of the experiment and 
the earlier attempt to impose restrictions on the columns. As men- 
tioned in §2, this has resulted in a constraint that each clone appears 
twice in each of the eight columns of plots. If there are any consistent 
differences between columns in respect of yielding capacity, the error 
variance will have been overestimated because it includes positional 
effects. The simple adjustment of removing a sum of squares of devia- 
tions of column totals is not legitimate because column differences are 
not orthogonal with mulching treatments or with the various inter- 
actions. 


Multiple regression technique 


In such a situation, the general procedure of “fitting constants” for 
the parameters that relate to the extra constraints and solving the 
normal equations can conveniently be replaced by a multiple regression 
and covariance analysis; the two are essentially the same, but the regres- 
sion calculations may fit into a more familiar routine. Outhwaite and 
Rutherford [1955] have described this type of analysis, basing it upon 
the fitting of orthogonal polynomials to the trend of the columns. If 
the complete elimination of column differences is planned (and not 
merely the fitting of a linear or quadratic trend), and no particular 
interest attaches to the polynomials of successively higher degree, the 
coefficients of any set of contrasts between columns can be used as 
the independent variates, provided only that no one of the contrasts 
is determinable as a combination of the others. In practice, a set of 
mutually orthogonal contrasts is the most convenient, as it keeps the 
covariance terms small in the next stage of the analysis, and the greatest 
arithmetical simplicity is achieved by using a set in which all coefficients 
are 1, 0, or -1. Therefore a suitable set of independent variates 
is x1, x2, ..., x7, where each x is constant within a column and the 
values are:— 

Column:—     1    2    3    4    5    6    7    8
x1 =        -1    1    0    0    0    0    0    0
x2 =         0    0   -1    1    0    0    0    0
x3 =        -1   -1    1    1    0    0    0    0
x4 =         0    0    0    0   -1    1    0    0
x5 =         0    0    0    0    0    0   -1    1
x6 =         0    0    0    0   -1   -1    1    1
x7 =        -1   -1   -1   -1    1    1    1    1
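The two properties asked of the variates, that each is a contrast between columns and that the set is mutually orthogonal, are easy to confirm. The sketch below uses the coefficients as read from the table above. Several signs in the scanned copy are damaged, so the matrix should be treated as a reconstruction.

```python
from itertools import combinations

# Rows are x1..x7; entries are the values taken in columns 1..8.
X = [
    [-1,  1,  0,  0,  0,  0,  0,  0],
    [ 0,  0, -1,  1,  0,  0,  0,  0],
    [-1, -1,  1,  1,  0,  0,  0,  0],
    [ 0,  0,  0,  0, -1,  1,  0,  0],
    [ 0,  0,  0,  0,  0,  0, -1,  1],
    [ 0,  0,  0,  0, -1, -1,  1,  1],
    [-1, -1, -1, -1,  1,  1,  1,  1],
]


def dot(u, v):
    """Sum of products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


is_contrast = all(sum(row) == 0 for row in X)
is_orthogonal = all(dot(u, v) == 0 for u, v in combinations(X, 2))
print(is_contrast, is_orthogonal)  # True True
```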
A covariance analysis of y on x1, x2, ..., x7 can now be made 
according to the familiar rules. The first step is the examination of the 
error regression. This requires the error sums of squares and products 
of the x_i that correspond with the intra-block error sum of squares 
(26 d.f.) in Table 2. The process of Table 1 has been followed exactly 
to give columns corresponding to the “difference between replicates” 
section of that table for each x_i. Sums of squares and products of the 
appropriate 26 totals have been formed and divided by 64; the results 
are recorded in Table 7. 


TABLE 7 
ERROR SUMS OF SQUARES AND PRODUCTS (26 D.F.) FOR x1, x2, ..., x7 
Each Entry Requires to be Divided by 8 


         x1     x2     x3     x4     x5     x6     x7
x1       56      8      0      8    -20    -16      0
x2              87     -1    -12     13     -7      2
x3                    107    -10      7      9     74
x4                            71     13     -4    -12
x5                                   66     -9     18
x6                                         127     14
x7                                                300


Next is formed the inverse matrix of the entries in Table 7, sometimes 
termed the set of Gauss multipliers. This calculation will be familiar 
to all who have worked with multiple regression, and is well described 
in many text books (e.g. Fisher [1950], §29; Goulden [1952], §§8.6 and 
8.7). This inverse matrix is shown in Table 8, the entries in it being 


TABLE 8 
INVERSE MATRIX OBTAINED FROM TABLE 7 


            x1         x2         x3         x4         x5         x6         x7
x1    0.189115  -0.032531  -0.007806  -0.041562   0.077960   0.027405  -0.005477
x2               0.104492   0.005174   0.029076  -0.037190  -0.000585   0.001449
x3                          0.091870   0.012727  -0.010533  -0.005205  -0.021312
x4                                     0.133459  -0.047930  -0.004289   0.005081
x5                                                0.168207   0.020042  -0.010099
x6                                                           0.068428  -0.003280
x7                                                                      0.032876
determined by the conditions that the sum of products of values in any 
pair of corresponding rows of Tables 7 and 8 is unity and the sum of 
products for any other pair of rows is zero. Tables 7 and 8 depend 
upon the design of the experiment but not on the particular set of yields 
to be analysed; hence, having once been calculated, they can be used 
in analyses of yields or other observations relating to any subsequent 
years or combination of years. 
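As a check on the relation between Tables 7 and 8, the matrix of Table 7 (each entry divided by 8) can be inverted by ordinary Gauss-Jordan elimination. The sketch below is not the hand method of the cited textbooks. Three entries of Table 7 are illegible in the scanned copy and are taken here as 71 (x4, x4), -4 (x4, x6) and -9 (x5, x6), the values that make the rows consistent with Table 8. The helper name `invert` is hypothetical.

```python
# Upper triangle of Table 7, row by row (entries still to be divided by 8).
UPPER = [
    [56, 8, 0, 8, -20, -16, 0],
    [87, -1, -12, 13, -7, 2],
    [107, -10, 7, 9, 74],
    [71, 13, -4, -12],      # 71 and -4 are reconstructed values
    [66, -9, 18],           # -9 is a reconstructed value
    [127, 14],
    [300],
]

N = len(UPPER)
A = [[0.0] * N for _ in range(N)]
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row):
        A[i][i + offset] = A[i + offset][i] = value / 8


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


inverse = invert(A)
print([round(x, 6) for x in inverse[0]])  # compare with the first row of Table 8
```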

The error sums of products for x_i with y in the analysis of covariance 
are obtained in the obvious manner: each of the appropriate 26 contrasts 
in column (vi) of Table 1 is multiplied by the corresponding quantity 
for x_i, and the sum of these products is divided by 64. The results are:— 


x1:    -191.75, 
x2:    2112.69, 
x3:    -534.69, 
x4:   -1438.44, 
x5:     938.25, 
x6:     307.06, 
x7:    2555.38. 


The error regression coefficients of y on the x_i are then formed as the 
sum of the products of these amounts with each row of Table 8 in turn, 
which operations give 


b1 =   26.5326, 
b2 =  151.0361, 
b3 = -120.9422, 
b4 = -162.6831, 
b5 =  119.2241, 
b6 =   33.8960, 
b7 =   81.7262. 


The sum of squares for column differences is the sum of products of 
the b_i with the error sums of products of x_i, y: 


\[ 26.5326 \times (-191.75) + \cdots + 81.7262 \times 2555.38 = 943{,}793, \]


from which follows the analysis in Table 9. The ratio of mean squares 
is not statistically significant, and in fact there is little indication that 
the columns really differ in their yielding capacities independently of 
the treatments applied. Rough examination of yields for each of two 
previous years also showed practically no sign of column effects, but 
there remains a possibility that positional differences may develop more 
strongly as the experiment continues. 
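The regression step described above amounts to multiplying the inverse matrix of Table 8 by the vector of error sums of products. The sketch below is a check on the arithmetic rather than the author's layout. It assumes the sums of products carry the signs -191.75, 2112.69, -534.69, -1438.44, 938.25, 307.06, 2555.38, and in particular that the x5 value is 938.25, since these are the readings that reproduce the printed coefficients and the printed sum of squares.

```python
# Table 8, upper triangle, row by row
INVERSE_UPPER = [
    [0.189115, -0.032531, -0.007806, -0.041562, 0.077960, 0.027405, -0.005477],
    [0.104492, 0.005174, 0.029076, -0.037190, -0.000585, 0.001449],
    [0.091870, 0.012727, -0.010533, -0.005205, -0.021312],
    [0.133459, -0.047930, -0.004289, 0.005081],
    [0.168207, 0.020042, -0.010099],
    [0.068428, -0.003280],
    [0.032876],
]
N = len(INVERSE_UPPER)
C = [[0.0] * N for _ in range(N)]
for i, row in enumerate(INVERSE_UPPER):
    for offset, value in enumerate(row):
        C[i][i + offset] = C[i + offset][i] = value

# Error sums of products of x1..x7 with y (signs and x5 value as assumed above)
P = [-191.75, 2112.69, -534.69, -1438.44, 938.25, 307.06, 2555.38]

# Regression coefficients: each row of the inverse times the vector P
b = [sum(c * p for c, p in zip(row, P)) for row in C]

ss_columns = sum(bi * pi for bi, pi in zip(b, P))   # 7 d.f.
ss_error = 2_928_948                                # 26 d.f.
ms_columns = ss_columns / 7
ms_residual = (ss_error - ss_columns) / 19

print([round(x, 4) for x in b])
print(round(ss_columns), round(ms_columns), round(ms_residual))
print(round(ms_columns / ms_residual, 2))  # ratio of mean squares
```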
TABLE 9 
ANALYSIS OF ERROR SUM OF SQUARES IN TABLE 2 
FOR TEST OF COLUMN DIFFERENCES 


Variation due to    d.f.    Sum of squares    Mean square
Columns                7           943,793        134,828
Residual              19         1,985,155        104,482

Total                 26         2,928,948


Adjustment of mean yields 


The column differences are not sufficiently marked to affect greatly 
the precision of the experiment, and no full adjustment for them will 
be made here. That is not to say that they should be ignored for the 
future; when the experiment has continued for several years and an 
analysis of total yields for a period is made, elimination of column 
effects may become important to the precision. Indeed, since the con- 
straints on columns were deliberately introduced into the design, no 
analysis should be considered strictly correct without this elimination, 
although Table 9 makes clear that conclusions about the 1954-55 yields 
will be unaffected by it. Except for the very much more laborious 
calculations, the situation is the same as that of a Latin square analysis 
in which the mean square for columns is little different from that for 
error. Here, as an example, it will suffice to deal with main effects of 
V and M, for the column effects are almost certainly too small to affect 
seriously the conclusion that the VM interaction is at present unim- 
portant. The adjustments required are made exactly as in an ordinary 
covariance analysis. 

The design has secured that each clone occurs twice in each column, 
and therefore differences between clones in respect of any of the x_i 
are zero. Hence no adjustments to the mean yields of the four clones 
are required, and the only change is that their standard error should be 
computed from the residual mean square in Table 9; the standard 
error for clone means shown in Tables 3 and 6 should be changed from 
±84 lb. per acre to ±81 lb. per acre. 

For mulch treatments, the process is more complicated. Perhaps 
the easiest form of calculation is to tabulate total contrasts for C, D, 
and CD in each of the x_i, exactly analogous to those for yield in the 
first column of totals in Table 1. If a complete adjustment of all 
results in respect of column differences were contemplated, construction 
of seven tables on the pattern of Table 1 would probably be the most 
convenient start. Here the three contrasts required have been picked 
out more directly and summarized in Table 10. Each yield contrast 


TABLE 10 
ADJUSTMENT OF CONTRASTS FOR FACTOR M 


                          Variate                            Yield
Contrast     x1    x2    x3    x4    x5    x6    x7    Unadjusted   Adjusted
C             0    -2     2     2    -4    -6    -4          2067       3944
D            -4     4    -4     2    -2    -8     8         -5025      -5826
CD           -4     2     6     4     2    -6   -12        -11047      -8921


is then adjusted to an estimate of what it would have been if the contrasts 
in x_i had been zero (i.e. perfect balance over columns) by subtracting 
the sum of products of each b_i with the corresponding x_i 
contrast. Thus for C the adjusted value is 

\[ 2067 - [26.5326 \times 0 + 151.0361 \times (-2) + \cdots + 81.7262 \times (-4)] = 3944. \]

From the adjusted contrasts, mean yields for the four treatments 
(Table 12) have been constructed as described in §4. 
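The adjustment and the construction of the adjusted means can be followed in code. The sketch below takes the regression coefficients and the Table 10 contrasts as printed, subtracts the sum of products from each yield contrast, and then applies the reversed Yates process to the adjusted values. The ordering CD, D, C, Sum and the mapping of results to None, Trenching, Cut bush, Bagasse are assumptions based on the treatment labels of Table 5.

```python
# Error regression coefficients b1..b7
B = [26.5326, 151.0361, -120.9422, -162.6831, 119.2241, 33.8960, 81.7262]

# Contrast totals for x1..x7 and for yield (Table 10)
X_CONTRASTS = {
    "C":  [0, -2, 2, 2, -4, -6, -4],
    "D":  [-4, 4, -4, 2, -2, -8, 8],
    "CD": [-4, 2, 6, 4, 2, -6, -12],
}
YIELD = {"C": 2067, "D": -5025, "CD": -11047}
GRAND_TOTAL = 61629

adjusted = {
    name: YIELD[name] - sum(b * x for b, x in zip(B, X_CONTRASTS[name]))
    for name in YIELD
}


def reverse_yates(contrasts):
    """Sums of pairs, then differences (second minus first), repeated."""
    column = list(contrasts)
    n = len(column)
    for _ in range(n.bit_length() - 1):
        pairs = [(column[i], column[i + 1]) for i in range(0, n, 2)]
        column = [a + b for a, b in pairs] + [b - a for a, b in pairs]
    return column


totals = reverse_yates([adjusted["CD"], adjusted["D"], adjusted["C"], GRAND_TOTAL])
means = [round(t / 64) for t in totals]
print({k: round(v, 1) for k, v in adjusted.items()})
print(means)  # None, Trenching, Cut bush, Bagasse
```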


Average variances 


The variance of a difference between any pair of these adjusted means 
depends upon which pair is to be examined, and in particular upon the 
corresponding differences in respect of the x_i, as is usual in a covariance 
analysis. However, for most purposes it is sufficient to ascribe an average 
variance or standard error to the means, by use of the formula 
given on p. 436 of Outhwaite and Rutherford [1955]. This involves 
forming from Table 10 the sums of squares and products for factor M 
(3 d.f.) in respect of the x_i, which are shown in Table 11; thus for x2 
and x3 the entry is 


\[ [(-2) \times 2 + 4 \times (-4) + 2 \times 6] \div 64 = -0.1250. \]


The sum of products of corresponding entries in Tables 8 and 11 is 
0.8064 (remembering that each table requires to be completed by a 
bottom left corner symmetrical with the top right), which must be 
divided by the number of degrees of freedom for M and added to unity 
to give a variance adjustment factor of 1.2688. The variance of a mean 


TABLE 11 
SUMS OF SQUARES AND PRODUCTS FOR M (3 D.F.) FOR x1, x2, ..., x7 

          x1        x2        x3        x4        x5        x6        x7
x1    0.5000   -0.3750   -0.1250   -0.3750    0         0.8750    0.2500
x2              0.3750   -0.1250    0.1875    0.0625   -0.5000    0.2500
x3                        0.8750    0.3125    0.1875   -0.2500   -1.7500
x4                                  0.3750   -0.0625   -0.8125   -0.6250
x5                                            0.3750    0.4375   -0.3750
x6                                                      2.1250    0.5000
x7                                                                3.5000


yield for M, instead of being simply the residual mean square in Table 9 
divided by 16, is this quantity multiplied by 1.2688: 


\[ \frac{104{,}482}{16} \times 1.2688 = (91)^2. \]
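The variance adjustment factor can be recomputed from Tables 8 and 10. The sketch below forms the Table 11 entries from the C, D and CD contrasts of the x_i, multiplies them entry by entry with the completed (symmetric) inverse matrix, and applies the divisor 3 for the degrees of freedom of M. It illustrates the arithmetic quoted in the text; the formula itself is that of Outhwaite and Rutherford as cited, not derived here.

```python
import math

# Table 8, upper triangle, row by row
INVERSE_UPPER = [
    [0.189115, -0.032531, -0.007806, -0.041562, 0.077960, 0.027405, -0.005477],
    [0.104492, 0.005174, 0.029076, -0.037190, -0.000585, 0.001449],
    [0.091870, 0.012727, -0.010533, -0.005205, -0.021312],
    [0.133459, -0.047930, -0.004289, 0.005081],
    [0.168207, 0.020042, -0.010099],
    [0.068428, -0.003280],
    [0.032876],
]
N = 7
C = [[0.0] * N for _ in range(N)]
for i, row in enumerate(INVERSE_UPPER):
    for offset, value in enumerate(row):
        C[i][i + offset] = C[i + offset][i] = value

# Contrast totals of x1..x7 for C, D and CD (Table 10)
X_CONTRASTS = [
    [0, -2, 2, 2, -4, -6, -4],
    [-4, 4, -4, 2, -2, -8, 8],
    [-4, 2, 6, 4, 2, -6, -12],
]

# Table 11: sums of squares and products for M, divisor 64
M = [[sum(row[i] * row[j] for row in X_CONTRASTS) / 64 for j in range(N)]
     for i in range(N)]

total = sum(C[i][j] * M[i][j] for i in range(N) for j in range(N))
factor = 1 + total / 3                       # 3 d.f. for M
standard_error = math.sqrt(104_482 / 16 * factor)

print(M[1][2])                               # -0.125, the x2, x3 entry
print(round(total, 4), round(factor, 4), round(standard_error))
```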


Thus the average standard error for comparisons amongst adjusted 
means is slightly greater than the approximate value in Tables 4 and 6 
that neglected to allow for column constraints. The increase is the 
price that must be paid for the non-orthogonality of the design, which 
here fails to be fully compensated by the reduction in the error mean 
square of Table 9 as compared with that of Table 2. Table 12 sum- 


TABLE 12 


MEAN YIELDS FOR MULCHES, ADJUSTED FOR COLUMNS 
(lb. dry cacao per acre) 


Bagasse    Cut bush    Trenching    None    Mean 

    853        1255          950     794     963 


S.E.: ±91 lb. per acre 


marizes the information on mulches, and shows that adjustment for 
columns has brought out more clearly the difference between cut bush 
and trenched material. 


6. SUMMARY 


A factorial experiment on cacao now in progress in Trinidad pre- 
sents several interesting features as an example of the design and 
analysis of field experiments. Originally conceived as a plaid square 
for a 2³ × 4 set of treatments, circumstances forced a change to partial 
confounding in randomized blocks for a 2 × 4² set. The change was 
made after certain factors had been introduced, and therefore some 
constraints from the earlier design remained. 

Yields from the experiment in 1954-55 are used in an illustration 
of the systematic analysis and summarizing needed for the partially 
confounded design, neglecting the additional constraints. The con- 
struction of tables of mean yields and the calculation of standard errors 
appropriate to various comparisons are emphasized. A multiple regres- 
sion technique is then used in order to take account of the other con- 
straints. 
Note added in proof 


The analysis of the yields of this experiment for the 1955-56 crop 
season reveals a column effect significant at the p .001 level. This 
abrupt change in the level of significance is explained by the consider- 
able fall in yield on good plots in columns 1 and 2 and the maintenance 


of good yields elsewhere in the experimental area, particularly in columns 
6, 7 and 8. 


SIMPLIFIED ANALYSIS OF SINGLY LINKED BLOCKS 


K. R. Nair 
Forest Research Institute, Dehra Dun, U. P., India, and 
Institute of Statistics, The Consolidated University of North Carolina, North Carolina, 
U.S.A. 


INTRODUCTION 


The dual of a balanced incomplete block design having the parameters 
v*, k*, r*, b*, λ* can, in the language of Youden [1951], be called 
“λ*-linked blocks” since every pair of blocks of the dual design will 
have λ* treatments in common. When λ* = 1, the dual is called Singly 
Linked Blocks. Shrikhande [1952] and Roy [1954] independently 
showed that it is a p.b.i.b. design with two associate classes having the 
following values for the various parameters: 


\[ v = k(rk - k + 1)/r; \quad k = k; \quad r = r; \quad b = rk - k + 1; \]
\[ \lambda_1 = 1, \quad \lambda_2 = 0; \]
\[ n_1 = r(k - 1), \quad n_2 = (k - r)(r - 1)(k - 1)/r; \]
\[ P_1 = \begin{pmatrix} (r - 1)^2 + k - 2 & (r - 1)(k - r) \\ (r - 1)(k - r) & (r - 1)(k - r)(k - r - 1)/r \end{pmatrix}, \]
\[ P_2 = \begin{pmatrix} r^2 & r(k - r - 1) \\ r(k - r - 1) & (k - r)^2 + 2(r - 1) - k(k - 1)/r \end{pmatrix}. \]


Bose and Shimamoto [1952] divided all p.b.i.b. designs having two 
associate classes into five distinct types. Singly Linked Blocks was 
listed as one of these types. More recently, however, Bose, Clatworthy 
and Shrikhande [1954] have recast some of these types with the result 
that singly linked blocks has got submerged under what they call 
“Simple type”. 

In both these papers by Bose et al., the analysis of p.b.i.b. designs 
having two associate classes has to some extent been simplified with 
the aid of four auxiliary parameters c1, c2, Δ and H. But this simplification 
does not naturally go far enough when a type or sub-type is 
separately considered. For instance, the analysis of Triangular Singly 
Linked Blocks given by Nair [1953] is much simpler than what it would 
have been by the use of the auxiliary parameters c1, c2, Δ and H devised 
by Bose et al. Similarly, the method of analysis of Square Lattices 
given by Cochran and Cox [1950] is the simplest possible for this sub- 
type (see Nair [1952]). 

The purpose of this paper is to give a similar simplified method of 
analysis for Singly Linked Blocks. This method will be applied to the 
illustrative example of a singly linked blocks design worked out in 
Chapter III of the bulletin by Bose, Clatworthy and Shrikhande [1954], 
in order to demonstrate how much simpler it is compared to their method. 

This simplified method was developed by the author in the spring 
of 1952 while working at the Institute of Statistics, the Consolidated 
University of North Carolina, Raleigh during the tenure of a joint 
Fulbright and Smith-Mundt Fellowship awarded to him by the United 
States Government. However, due to lack of a suitable illustrative 
example at the time, its publication was deferred.* 


SIMPLIFIED ANALYSIS 


The general formula (37) for estimating the effect of a treatment 
with recovery of inter-block information, given in Nair’s [1952] paper, 
can be simplified to yield the following expression for the adjusted 
treatment total with recovery of inter-block information: 


\[ T_i - \mu \sum_{(i)} \{rQ_{(j)}\} \tag{1} \]

where 


\[ \mu = (w - w')/[bw + (k - 1)w'] \tag{2} \]


Q_(j) = Total yield B_(j) of block (j) minus sum of the mean 
yields of the k treatments appearing in that block   (3) 


T_i = Unadjusted total yield for treatment i,   (4) 


and Σ_(i) in (1) denotes summation over the r values of Q_(j) pertaining 
to the r blocks in which treatment i occurs. 
It will be seen that 

\[ rQ_{(j)} = rB_{(j)} - \sum_{(j)} (T_i) \tag{5} \]

where Σ_(j) denotes summation over the k values of T_i pertaining 
to the k treatments occurring in the (j)th block. 


Dividing (1) by r we obtain the adjusted mean with recovery of 
inter-block information for treatment 7. 


*The author was aware that the experiment on oats by Dr. Middleton used by Bose et al. for 
illustration was laid out during 1951, The yield data were however not available before he left Raleigh 
in June 1952. 
To obtain the corresponding intra-block estimates we have only 
to assume w′ = 0 or μ = 1/b. 

For estimating the values of w and w’ from the data we have to 
perform the analysis of variance described below and presented in 
Table 1. 


TABLE 1 
ANALYSIS OF VARIANCE 

Source of              Degrees of         Sum of squares                   Mean
variation              freedom                                             square

Blocks (adj.)          b - 1              (r/b) Σ_j Q_(j)^2                E_b
Treatments (unadj.)    v - 1              (1/r) Σ_i T_i^2 - G^2/(vr)
Intra-block error      vr - b - v + 1     (By subtraction)                 E_e
Total                  vr - 1             Σ y^2 - G^2/(vr)

Blocks (unadj.)        b - 1              (1/k) Σ_j B_(j)^2 - G^2/(bk)
Treatments (adj.)      v - 1              (By relation (7))                E_t


The total sum of squares, the treatment sum of squares (unadjusted 
for block effects) and the block sum of squares (unadjusted for treat- 
ment effects) are calculated in the usual straightforward way. 

It is much easier and simpler in the case of singly linked blocks 
first to calculate the sum of squares for blocks (adjusted for treatment 
effects) than to calculate the sum of squares for treatments (adjusted 
for block effects). The former is given by the elegant expression 


\[ \text{Block S.S. (adjusted)} = \frac{r}{b} \sum_{j=1}^{b} Q_{(j)}^2 \tag{6} \]


The latter can then be obtained using the well-known relation: 


Treatment S.S. (adj.) + Block S.S. (unadj.) 
= Treatment S.S. (unadj.) + Block S.S. (adj.)   (7) 
The variance ratio F = E_t/E_e provides a test of significance of 
intra-block estimates of treatment effects. 

Nair [1944] had shown that for a non-resolvable incomplete block 
design (such as the singly linked blocks) the estimates of w and w′ 
are given by 


\[ w = \frac{1}{E_e}, \qquad w' = \frac{v(r - 1)}{k(b - 1)E_b - (v - k)E_e} \tag{8} \]

Hence, the estimate of μ follows as 

\[ \mu = \frac{E_b - E_e}{bE_b} \tag{9} \]


If E_b turns out to be less than E_e it is conventionally assumed that 
w = w′ or μ = 0. In such a case the unadjusted treatment means 
T_i/r are compared as if the experiment was in randomized complete 
blocks. 

For calculating the lowest significant differences among the adjusted 
treatment means we have to calculate the variance of the difference 
between every pair of them. These pairs fall into groups, namely, 
those pairs which occurred together in the same block and those pairs 
which did not occur together in the same block. 

(a) Variance of the difference between pairs which have occurred 
together in the same block can be obtained by direct substitution in 
formula (39) of Nair’s [1952] paper. It simplifies to the form 


\[ \frac{2}{rw}[1 + (r - 1)\mu] \tag{10} \]

(b) Variance of the difference between two treatments which have 
not occurred together in the same block can be obtained by substitution 
in formula (40) of Nair’s [1952] paper. It simplifies to the form 

\[ \frac{2}{rw}(1 + r\mu) \tag{11} \]


(c) The mean variance of differences for all pairs of treatments is 
derivable from formula (41) of Nair’s [1952] paper and simplifies to 
the form 


\[ \frac{2}{rw}\left[1 + \frac{r(v - k)\mu}{v - 1}\right] \tag{12} \]


By substituting μ = 1/b in (10), (11) and (12) we obtain the corresponding 
variances for intra-block estimates of treatment effects. 
It is interesting to note that the simplified analysis for Triangular 
Singly Linked Blocks given by Nair [1953] can be deduced as a special 
case of the above by substituting 


\[ v = \tfrac{1}{2}p(p - 1), \quad k = (p - 1), \quad r = 2, \quad b = p. \]
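The three variance formulas hang together in a way that is easy to verify: the mean variance (12) should be the average of (10) and (11), weighted by the numbers of first and second associates. The sketch below reads the formulas as (2/rw)[1 + (r - 1)μ], (2/rw)(1 + rμ) and (2/rw)[1 + r(v - k)μ/(v - 1)]. This is a reconstruction of a damaged passage, so the function `variance_factors` is an illustration under that assumption. The common factor 2/(rw) is omitted.

```python
from fractions import Fraction as F


def variance_factors(v, k, r, mu):
    """Multipliers of 2/(rw) in formulas (10), (11) and (12), as read here."""
    same_block = 1 + (r - 1) * mu
    different_blocks = 1 + r * mu
    mean = 1 + r * (v - k) * mu / (v - 1)
    return same_block, different_blocks, mean


# Parameters of the illustrative oats experiment
v, k, r, b = 35, 7, 3, 15
n1 = r * (k - 1)          # treatments sharing a block with a given treatment
n2 = v - 1 - n1           # treatments never sharing a block with it

mu = F(1, b)              # intra-block case
same, different, mean = variance_factors(v, k, r, mu)
weighted = (n1 * same + n2 * different) / (n1 + n2)

print(same, different, mean)   # 17/15 6/5 99/85
print(weighted == mean)        # True
```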


ILLUSTRATIVE EXAMPLE 


The data for this example have been taken from pp. 28-29 of Bose, 
Clatworthy and Shrikhande [1954]. They consist of the yield of 35 
varieties of oats in an experiment conducted in 15 blocks of 7 plots 
each arranged according to the singly linked block design. The values 
of the various parameters of the design are:— 


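v = 35, k = 7, r = 3, b = 15 
λ1 = 1, λ2 = 0 
n1 = 18, n2 = 16 


\[ P_1 = \begin{pmatrix} 9 & 8 \\ 8 & 8 \end{pmatrix}, \qquad P_2 = \begin{pmatrix} 9 & 9 \\ 9 & 6 \end{pmatrix} \]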


Table 2A, giving the field plan, shows the block number (j) in the 
first column. In the second column opposite each block number (and 
in the same row) are given the treatments (7) appearing in this block, 
and below each treatment number 7 is shown the corresponding yield. 
Adding all the yields of all the plots in the block we get B,;,. This is 
the topmost figure in the cell of column (8) relating to each block. 
Summing up the block totals B:;, we get the grand total G of all the 
observations, namely 41763. The grand mean is G/vr = 397.748. 

Table 2B, giving the treatment allocation plan, shows the treatment 
number, i, in the first column. In the second column opposite each 
treatment number (and in the same row) are given the blocks (j) in 
which this treatment appears, and below each block number (j) is 
shown the corresponding observed yield of the treatment. The top 
figure in each cell of column (3) gives the treatment total $T_i$ obtained 
by adding the yields in the second row of the second column opposite i. 

These values of $T_i$ are reproduced in the third row of each cell of 
the second column of Table 2A. The sum of the values of $T_i$ for block (j) 
is denoted by $\sum_{(j)} T_i$ and given below the block total $B_{(j)}$ in column 
(3) of Table 2A. By subtracting this sum from r times $B_{(j)}$ we obtain, 
as stated in equation (5), r times the value of $Q_{(j)}$ given in column (4) 
of Table 2A. 

Thus, for block (1), 

$$rQ_{(1)} = 3 \times 2991 - 8757 = +216$$
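
The block (1) arithmetic can be checked mechanically. The following is a minimal sketch in Python that uses only the block (1) entries of Table 2A; the function name `r_times_q` is ours and does not appear in the paper.

```python
def r_times_q(r, block_total, treatment_totals):
    """Return r*Q_(j) = r*B_(j) minus the sum of T_i over the treatments in block j."""
    return r * block_total - sum(treatment_totals)

# Block (1) of Table 2A: yields of its seven plots, and the totals T_i
# of the seven treatments that occur in it.
yields_block1 = [389, 502, 512, 314, 442, 404, 428]
totals_block1 = [1294, 1580, 1352, 818, 1280, 1088, 1345]

b1 = sum(yields_block1)                  # block total B_(1)
rq1 = r_times_q(3, b1, totals_block1)    # r*Q_(1) with r = 3 replications
print(b1, sum(totals_block1), rq1)
```

The same function applied to the other fourteen blocks gives the remaining entries of column (4).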


TABLE 2A 
FIELD PLAN 

 (1)                       (2)                            (3)        (4)
Block       Treatments (i), yields and T_i               B_(j)     rQ_(j)
 (j)                                                   and ΣT_i

 (1)    (35)   (4)  (18)  (15)   (1)  (30)  (17)
         389   502   512   314   442   404   428          2991      +216
        1294  1580  1352   818  1280  1088  1345          8757

 (2)    (26)  (11)   (2)  (31)   (5)  (19)  (18)
         431   482   442   411   456   436   427          3085       +23
        1229  1494  1345  1117  1536  1159  1352          9232

 (3)    (34)  (10)  (19)   (4)   (8)  (22)  (21)
         334   320   378   498   452   413   272          2667       +69
         834   990  1159  1580  1391  1085   893          7932

 (4)    (18)   (9)  (21)   (3)   (7)  (25)  (33)
         413   440   306   407   526   212   385          2689      -423
        1352  1399   893  1357  1690   758  1041          8490

 (5)     (8)  (33)  (23)  (30)  (14)  (12)  (26)
         443   286   265   328   417   280   326          2345      -667
        1391  1041   796  1088  1148  1009  1229          7702

 (6)    (23)  (22)  (35)   (5)   (9)   (6)  (20)
         271   286   455   504   526   515   314          2871      -120
         796  1085  1294  1536  1399  1513  1110          8733

 (7)    (20)  (16)   (2)  (33)   (4)  (28)  (13)
         366   491   556   370   580   284   434          3081      +518
        1110  1613  1345  1041  1580   795  1241          8725

 (8)     (6)  (21)  (28)  (12)  (31)  (15)  (29)
         406   315   285   365   380   260   365          2376      -276
        1513   893   795  1009  1117   818  1259          7404

 (9)    (10)  (27)  (35)  (25)  (28)  (11)  (14)
         325   325   450   292   226   480   328          2426      -200
         990   999  1294   758   795  1494  1148          7478

(10)    (19)  (32)   (3)   (1)  (12)  (20)  (27)
         345   425   452   400   364   430   322          2738       -42
        1159  1342  1357  1280  1009  1110   999          8256

(11)    (17)   (6)  (32)   (8)  (25)   (2)  (24)
         431   592   510   496   254   347   275          2905      +158
        1345  1513  1342  1391   758  1345   863          8557

(12)     (7)  (30)  (29)  (22)  (13)  (32)  (11)
         642   356   476   386   376   407   532          3175      +326
        1690  1088  1259  1085  1241  1342  1494          9199

(13)     (3)  (29)   (5)  (34)  (16)  (14)  (17)
         498   418   576   265   634   403   486          3280      +748
        1357  1259  1536   834  1613  1148  1345          9092

(14)    (13)  (26)  (34)  (27)  (24)   (9)  (15)
         431   472   235   352   276   433   244          2443       -54
        1241  1229   834   999   863  1399   818          7383

(15)    (31)   (1)  (16)  (24)  (10)  (23)   (7)
         326   438   488   312   345   260   522          2691      -276
        1117  1280  1613   863   990   796  1690          8349

G = 41763, G/vr = 397.743 

TABLE 2B 

TREATMENT ALLOCATION PLAN, INTRA-BLOCK ESTIMATES, AND 
COMBINED INTRA- AND INTER-BLOCK ESTIMATES 

 (1)           (2)               (3)           (4)         Adjusted treatment mean
Treat-     Blocks (j),           T_i       (r/b) ΣQ_(j)    with inter-block information
ment       yield and             and           and            (5)            (6)
  i         rQ_(j)            r ΣQ_(j)      rμ ΣQ_(j)     Not recovered   Recovered

  1      (1)   (10)   (15)
         442    400    438      1280        - 6.800         428.933
        +216    -42   -276      -102        - 0.951                        426.984

  2      (2)    (7)   (11)
         442    556    347      1345        +46.600         432.800
         +23   +518   +158      +699        + 6.517                        446.161

  3      (4)   (10)   (13)
         407    452    498      1357        +18.867         446.044
        -423    -42   +748      +283        + 2.639                        451.454

  4      (1)    (3)    (7)
         502    498    580      1580        +53.533         508.822
        +216    +69   +518      +803        + 7.487                        524.171

  5      (2)    (6)   (13)
         456    504    576      1536        +43.400         497.533
         +23   -120   +748      +651        + 6.070                        509.977

  6      (6)    (8)   (11)
         515    406    592      1513        -15.867         509.622
        -120   -276   +158      -238        - 2.219                        505.073

  7      (4)   (12)   (15)
         526    642    522
        -423   +326   -276

TABLE 2B (Continued) 

 (1)           (2)               (3)           (4)         Adjusted treatment mean
Treat-     Blocks (j),           T_i       (r/b) ΣQ_(j)    with inter-block information
ment       yield and             and           and            (5)            (6)
  i         rQ_(j)            r ΣQ_(j)      rμ ΣQ_(j)     Not recovered   Recovered

 13      (7)   (12)   (14)
         434    376    431      1241        +52.667         396.111
        +518   +326    -54      +790        + 7.366                        411.211

 14      (5)    (9)   (13)
         417    328    403      1148        - 7.933         385.311
        -667   -200   +748      -119        - 1.110                        383.037

 15      (1)    (8)   (14)
         314    260    244       818        - 7.600         275.200
        +216   -276    -54      -114        - 1.063                        273.021

 16      (7)   (13)   (15)
         491    634    488      1613        +66.000         515.667
        +518   +748   -276      +990        + 9.231                        534.590

 17      (1)   (11)   (13)
         428    431    486      1345        +74.800         423.400
        +216   +158   +748     +1122        +10.461                        444.846

 18      (1)    (2)    (4)
         512    427    413      1352        -12.267         454.756
        +216    +23   -423      -184        - 1.716                        451.239

 19      (2)    (3)   (10)
         436    378    345      1159        + 3.333         385.222
         +23    +69    -42       +50

 20      (6)    (7)   (10)
         314    366    430
        -120   +518    -42

TABLE 2B (Continued) 

 (1)           (2)               (3)           (4)         Adjusted treatment mean
Treat-     Blocks (j),           T_i       (r/b) ΣQ_(j)    with inter-block information
ment       yield and             and           and            (5)            (6)
  i         rQ_(j)            r ΣQ_(j)      rμ ΣQ_(j)     Not recovered   Recovered

 26      (2)    (5)   (14)
         431    326    472      1229        -46.533         425.178
         +23   -667    -54      -698        - 6.508                        411.836

 27      (9)   (10)   (14)
         325    322    352       999        -19.733         339.578
        -200    -42    -54      -296        - 2.760                        333.920

 28      (7)    (8)    (9)
         284    285    226       795        + 2.800         264.067
        +518   -276   -200       +42        + 0.392                        264.869

 29      (8)   (12)   (13)
         365    476    418      1259        +53.200         401.933
        -276   +326   +748      +798        + 7.440                        417.187

 30      (1)    (5)   (12)
         404    328    356      1088        - 8.333         365.444
        +216   -667   +326      -125        - 1.165                        363.055

 31      (2)    (8)   (15)
         411    380    326      1117        -35.267         384.089
         +23   -276   -276      -529        - 4.932                        373.977

 32     (10)   (11)   (12)
         425    510    407      1342        +29.467         437.511
         -42   +158   +326      +442        + 4.121                        445.960

 33      (4)    (5)    (7)
         385    286    370      1041        -38.133         359.711
        -423   -667   +518      -572        - 5.333                        348.778

 34      (3)   (13)   (14)
         334    265    235       834        +50.867         261.044
         +69   +748    -54      +763        + 7.114                        275.629

 35      (1)    (6)    (9)
         389    455    450      1294        - 6.933         433.644
        +216   -120   -200      -104        - 0.970                        431.657


The values of $rQ_{(j)}$ are reproduced in the third row of the cells of 
column (2) of Table 2B, below each block yield and against each treat- 
ment number, i. Their sums, denoted by $r\sum Q_{(j)}$, are given below 
the values of $T_i$ in column (3). By dividing these sums by b (in this 
case 15) we obtain the corrections, given in the upper entries in cells 
of column (4), to be applied to $T_i$ for obtaining the adjusted treatment 
totals without recovery of inter-block information; the corresponding 
means are given in column (5). 
To obtain the adjusted treatment total with recovery of inter-block 
information, we have to multiply the second figure given against each 
treatment i in column (3) of Table 2B by the estimated value of $\mu$ and 
subtract this quantity from $T_i$. 
To obtain the estimate of $\mu$, we have to complete the analysis of 
variance given in Table 3. 


TABLE 3 
ANALYSIS OF VARIANCE 

                           Degrees
Source of variation          of        Sum of squares     Mean square
                           freedom
Blocks (adjusted)            14          41,028.089        2930.578 (E_b)
Treatments (unadj.)          34         753,254.057
Intra-block error            56         141,159.911        2520.713 (E_e)
Total                       104         935,442.057
Blocks (unadjusted)          14         182,839.200
Treatments (adjusted)        34         611,442.946       17983.616 (E_t)


The total sum of squares, the block (unadjusted) and the treatment 
(unadjusted) sums of squares are calculated in the usual straightforward 
way. 

The block (adjusted) sum of squares is easily calculated by squaring 
the values given in the last column of Table 2A and using formula (6) 
in a slightly modified form, namely, 


$$\frac{1}{rb}\sum_{j}\,[rQ_{(j)}]^2 = \frac{1}{45}\,[(216)^2 + (23)^2 + \cdots + (276)^2]$$

$$= 41028.089$$
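
As a check on this figure, the sum can be evaluated directly. The sketch below assumes that the divisor 45 in the display is rb = 3 × 15, which is our reading of the modified formula (6), and takes the fifteen values of rQ from the last column of Table 2A.

```python
# r*Q_(j) for blocks 1..15, copied from column (4) of Table 2A.
rq = [216, 23, 69, -423, -667, -120, 518, -276, -200, -42,
      158, 326, 748, -54, -276]
r, b = 3, 15  # replications and number of blocks

# Assumed reading of the modified formula (6): sum of (r*Q)^2 divided by r*b.
ss_blocks_adjusted = sum(q * q for q in rq) / (r * b)
print(sum(rq), round(ss_blocks_adjusted, 3))
```

The values of rQ sum to zero, as the adjusted block totals must, which is a convenient check on the last column of Table 2A before squaring.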


The intra-block error sum of squares and the treatment (adjusted) 
sum of squares follow by subtraction. 

It will be noticed that the Block (adjusted) sum of squares and 
Treatment (adjusted) sum of squares given in Table 3 closely agree 
with those given in Tables 3.3 and 3.2 respectively on pages 38 
and 36 of Bose, Clatworthy and Shrikhande [1954]. The slightly bigger 
disagreement in the case of the former is due to a computational mistake 
in the treatment (unadjusted) sum of squares in their Table 3.3. 
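The intra-block variance ratio $F = E_t/E_e = 17983.616/2520.713 = 7.134$ is 
significant at the 0.1 per cent level. 
With $E_b$ and $E_e$ denoting the blocks (adjusted) and intra-block error 
mean squares of Table 3, the estimate of $\mu$ is 

$$\mu = \frac{E_b - E_e}{b\,E_b} = 0.00932387.$$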

Each treatment total $T_i$ given in column (3) of Table 2B may now 
be adjusted by subtracting from it either 1/b or $\mu$ times the value of 
$r\sum Q_{(j)}$ given just below the value of $T_i$, according as the adjustment 
desired is without or with recovery of inter-block information. The 
corresponding adjusted treatment means are obtained by dividing 
those totals by 3. They are given in columns (5) and (6) of Table 2B. 
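
The two adjustments can be illustrated with a short calculation. This is only a sketch of the rule just stated; the function name `adjusted_means` is ours, and the inputs are the entries for treatments 1 and 2 in Table 2B.

```python
def adjusted_means(t_total, r_sum_q, r=3, b=15, mu=0.00932387):
    """Return (mean without recovery, mean with recovery) for one treatment.

    t_total is T_i and r_sum_q is r * sum of Q_(j) over the blocks
    containing the treatment (the second figure in column (3) of Table 2B).
    """
    without_recovery = (t_total - r_sum_q / b) / r
    with_recovery = (t_total - mu * r_sum_q) / r
    return without_recovery, with_recovery

m1 = adjusted_means(1280, 216 - 42 - 276)   # treatment 1, blocks 1, 10, 15
m2 = adjusted_means(1345, 23 + 518 + 158)   # treatment 2, blocks 2, 7, 11
print([round(x, 3) for x in m1], [round(x, 3) for x in m2])
```

Rounded to three decimals, the results agree with columns (5) and (6) of Table 2B for these two treatments.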

(a) The variance of a difference between means, adjusted with 
recovery of inter-block information, of two treatments occurring in 
the same block is, using formula (10) with $E_e = 2520.713$, 

$$\frac{2E_e}{r}\,[1 + (r-1)\mu] = 2(1.0186477 \times 2520.713)/3 = 1711.812$$

(b) The variance of a difference between means, similarly adjusted, 
of two treatments not occurring together in the same block is, using 
formula (11), 

$$\frac{2E_e}{r}\,(1 + r\mu) = 2(1.0279716 \times 2520.713)/3 = 1727.481.$$

(c) The mean variance of differences for all pairs of adjusted treat- 
ment means is, using formula (12), 

$$\frac{2E_e}{r}\left[1 + \frac{r(v-k)}{v-1}\,\mu\right] = \frac{2}{3}\left(1 + \frac{42}{17}\,\mu\right)(2520.713) = 1719.186.$$

To obtain values of the corresponding variances for differences 
between treatment means adjusted without recovery of inter-block 
information, we have only to replace $\mu$ by 1/b = 1/15 in (a), (b) and (c). 
The values of these variances are 1904.539, 2016.570 and 1957.260 
respectively. 
The l.s.d. values may now be calculated in the usual way. 
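
The six variances quoted in (a), (b) and (c) can be reproduced numerically. The sketch below assumes our reading of formulae (10) to (12), namely a common factor 2E_e/r multiplied by 1 + (r-1)w, 1 + rw and 1 + r(v-k)w/(v-1), where w is μ with recovery and 1/b without; the function name `variances` is ours.

```python
e_e = 2520.713             # intra-block error mean square, Table 3
r, k, v, b = 3, 7, 35, 15  # design parameters
mu = 0.00932387            # estimated weight with recovery

def variances(w):
    """Variances of a difference between two adjusted means for weight w."""
    same_block = 2 * e_e * (1 + (r - 1) * w) / r
    different_blocks = 2 * e_e * (1 + r * w) / r
    mean_over_pairs = 2 * e_e * (1 + r * (v - k) * w / (v - 1)) / r
    return same_block, different_blocks, mean_over_pairs

with_recovery = variances(mu)
without_recovery = variances(1 / b)
print([round(x, 3) for x in with_recovery])
print([round(x, 3) for x in without_recovery])
```

The mean variance in each case is also the average of the first two weighted by n_1 = 18 and n_2 = 16, which gives an independent check on (c).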
Experiments on consumer preference have taken many forms. In 
that known as “paired comparisons”, the several treatments or stimuli 
are compared in pairs, each treatment appearing with every other 
treatment in the same pair. It is an especially appropriate design 
for testing the effect upon a food of chemically different pesticides. 
These may produce qualitative differences in flavor which the subject 
finds easier to compare in pairs. Usually he is asked to report a simple 
preference for one of the two samples in each pair. Scheffé [1952] has 
extended this procedure by asking each subject to note the degree of 
his preference within each pair as well as its direction. 

A recent experiment of this type concerned the relative palatability 
of apples sprayed with two different insecticides and with two different 
fungicides in a 2 X 2 factorial design. In its analysis, several statistical 
methods were to be compared in respect to their sensitivity, consistency 
and computational requirements. Among them were the Mosteller 
[1951a,b] modification of the original proposal by Thurstone, the 
Bradley-Terry [1952a,b; 1953] analysis, and the Scheffé [1952] tech- 
nique. In the course of the study, a new approach was developed which 
shows particular promise. It is based upon the mean normal deviate, 
tabled by Fisher and Yates [1953] and called the "rankit" by Ipsen 
and Jerne [1944]. It answered several questions which we could not 
otherwise resolve so readily, if at all. We will describe this rankit 
analysis in its present application and compare it with alternative 
procedures. 

The experiment had several objectives. One was to arrange the 
treatments on a linear scale, spaced so as to reflect the average degree 
of preference expressed by the tasters. Criteria were needed for judging 
the significance on this scale of effects associated with factorial combi- 
nations of the individual treatments. The scale had to be validated by a 
test of its additivity or subtractivity. If, for example, B was preferred to 
A and C to B, would C be preferred to A? Did the replicates of the 
experiment agree? 

Also, there was no assurance a priori that the tasters would agree. 
Some might judge more consistently than others in replicated tests. 
Or, though consistent, they might differ in their response to specific 
factorial comparisons. Both kinds of discrepancy could be important 
to the experimenter. The order of presentation of the two treatments 
comprising a pair, or of a given pair in a sequence of pairs may have 
been a factor. The samples tasted first, for example, might modify the 
subject’s reaction to the following samples, even though the pesticide 
affected flavor only indirectly, by influencing the content of sugars, 
acids or other constituents in the fruit (Garman et al. [1953]). Informa- 
tion regarding these possibilities was therefore sought. Finally, an 
analysis with good discriminatory power which required a minimum 
of specialized tables or of involved statistical calculations would be 
preferred. The proposed rankit analyses will be examined with these 
objectives in mind. 


EXPERIMENTAL PROCEDURE 


Samples and their preparation 


The test samples were randomly-selected Cortland apples from the 
fall harvest in an experimental orchard of The Connecticut Agricultural 
Experiment Station at New Haven in 1951 (Garman et al. [1953]). 
They had been subjected to four different spray treatments involving 
two insecticides, lead arsenate and parathion, and two fungicides, 
thiram and sulfur, in a 2 X 2 factorial design. Earlier work (Greenwood 
et al. [1951]) had demonstrated that unsweetened apple sauce was a 
better medium than raw apples for the detection of flavor differences. 
Not only is sauce more homogeneous, but it can be made more repre- 
sentative by preparing each sample from portions of several apples. 
In order to avoid any changes which might take place in raw apples 
during storage, sauce sufficient for all of the taste tests was made on a 
single day. 

In preparing the sauce, ten apples from each treatment were selected 
for each test day. After these had been washed in detergent suds, 
rinsed and wiped, they were quartered, cored, and weighed in 300 gram 
portions. Each such portion required about one-fourth of each apple 
and made a sample of sauce sufficient for 30 to 35 tasters. After cutting 
into 16th’s, the apple segments comprising each sample were added 
to 150 grams of boiling tap water, covered, and boiled gently for 15 
minutes, with one stirring after 10 minutes. After cooking, each sample 
was weighed, sieved through a Foley mill with 50 turns of the handle, 
cooled, put into a standard waxed container, and frozen. Before each 
day’s taste test, the required samples of apple sauce were thawed and 
allowed to come to room temperature. 


Test procedure 


On each test day, samples representing the four different combina- 
tions of spray components were offered to the tasters in six pairs, each 
treatment with every other treatment. The sample pairs were identified 
by Roman numerals and tasted in the order indicated on the score 
sheet (Table 1). For each pair, the taster was asked: (1) to give his 


TABLE 1 
INDIVIDUAL SCORE SHEET FOR EACH TASTE TEST 

Instructions: Do not discuss your reaction with other tasters. Compare sample A 
with B in each pair and state preference and degree of preference. (Always make a 
choice.) The first time taste sample A and then sample B, but you may go back 
and re-taste in any order. Start with Pair I and work toward the last pair. 

Pair                                 I     II    III    IV     V     VI

I prefer (A or B)

The difference is:    slight
(check one)           moderate
                      large
                      no difference

Comments: 
University of Connecticut                    Date: 
School of Home Economics                     Test No: 


preference for sample A or B, in every case making a choice between 
them, and (2) to check whether he considered the difference slight, 
moderate, large or, if his initial choice had been arbitrary, non-existent. 
The sample pairs which corresponded to the Roman numerals on a 
given score sheet were determined for each taster by the sequence in a 
row of the 6 X 6 Latin square in Table 2. The treatment designations 
beneath Table 2 follow the customary nomenclature for a 2 X 2 factorial 
design. Only one insecticide, lead arsenate (L), and one fungicide, 
thiram (Th), are represented by letters; when either was absent, it was 
TABLE 2 
LATIN SQUARE FOR ASSIGNING PAIRS OF SAMPLES 
(designated on the score sheets as A and B respectively) to successive subjects on 
the first, third and fifth test day, to insure that all orders of tasting were represented 
on each occasion. On the second, fourth and sixth test day, every subject received the 
same sequence of sample pairs as on the preceding test day, but in the reverse order. 

                      Order of tasting pairs in sequence
Sequence       I          II         III        IV         V          VI
   1        LTh:Th     L:Th       (-):Th     LTh:L      (-):LTh    L:(-)
   2        L:Th       LTh:Th     L:(-)      (-):LTh    (-):Th     LTh:L
   3        (-):Th     L:(-)      L:Th       LTh:Th     LTh:L      (-):LTh
   4        LTh:L      (-):Th     (-):LTh    L:Th       L:(-)      LTh:Th
   5        (-):LTh    LTh:L      LTh:Th     L:(-)      L:Th       (-):Th
   6        L:(-)      (-):LTh    LTh:L      (-):Th     LTh:Th     L:Th

SPRAY TREATMENTS 

LTh = lead arsenate and thiram 
L   = lead arsenate and sulfur 
Th  = parathion and thiram 
(-) = parathion and sulfur 


replaced in the spray mixture by its alternate, parathion or sulfur. 

The tasters, including both men and women, were students and staff 
members at the University of Connecticut. On any one day they 
numbered from 31 to 34, but of these only 25 were present on all six 
test days. The present paper is based on the data from these 25 tasters. 
Since they were not asked to assess an attribute but merely to state 
an overall, subjective preference, they were given no preliminary 
training. Some, however, had participated in other tests of a similar 
nature. 


Sample sequence 


A primary objective of the Latin square in Table 2 was to insure 
that every pair of samples would occur in each position in the sequence 
of six pairs. The six replicate tests were arranged in three pairs. On 
the first test day, the tasters were assigned in rotation, in order of arrival, 
to the successive rows in the Latin square. Two days later, the test 
was repeated and each subject received his pairs of samples in exactly 
the same sequence as in the preceding test, but with the order within 
each pair reversed. A pair consisting of L:Th on the first day, for ex- 
ample, was presented in the order Th:L on the second day. On the 
third and again on the fifth test days, the tasters were reassigned to 
the sequences 1 to 6 in the order of their arrival. 

By virtue of this design, the order of tasting within each pair was 
fully balanced and its effect will be tested in later sections. This was 
not true, however, between pairs. Since the 25 tasters did not arrive in 
any planned order, the sample sequences were not balanced among them. 
In consequence, the number of repetitions of each sequence varied from 
20 to 30 and the number of different tasters represented in each sequence 
from 8 to 13, with 10 tasters repeating a sequence and one taster follow- 
ing the same sequence (No. 3) in all six tests. The incomplete balance 
among sequences was a potential source of bias if the preference asso- 
ciated with a given insecticide or fungicide were to increase or diminish 
during the tasting of six paired samples of apple sauce. 

To test this possibility, the percentage preference for lead arsenate 
(L) over parathion was determined for each position (I to VI) in the 
order of tasting when both were paired with the same fungicide. Similar 
percentage preferences were determined for thiram (Th) over sulfur, 
when both were paired with the same insecticide. These preferences 
are given separately for each sequence in Table 3. The percentages 


TABLE 3 
DISTRIBUTION OF TASTERS, TRIALS AND PERCENTAGE PREFERENCES 
AMONG THE SEQUENCES IN TABLE 2 

Preferences for lead arsenate (L) and for thiram (Th) restricted to pairs with the 
same fungicide or insecticide, respectively. 

           No. of    No. of     Percentage preference for L or Th in specified pair
Sequence   tasters   tests        I       II      III      IV       V       VI
   1         12        30       53.3*            60.0     73.3             63.3*
   2         11        28                …*       …*                …       …
   3          8        20       45.0    40.0*            50.0*    65.0
   4         10        22       50.0    40.9                      54.5*    63.6*
   5         13        26               61.5     50.0*   57.7*             65.4
   6          …        24       54.2*            70.8     45.8    58.3*

*Lead arsenate (L) compared with parathion in a pair with the same fungicide; in the remaining 
percentages, thiram (Th) was compared with sulfur in a pair with the same insecticide. 


within each row represented the same group of tasters and hence were 
comparable by pairs, without confounding their positions in a sequence 
of tests with differences between tasters. Although the preference both 
for lead arsenate and for thiram tended to increase with order of tasting 
I to VI, the trend within tasters was not significant for either spray 
component (b = 1.93 ± 1.14 for lead arsenate and b = 1.11 ± 1.87 
for thiram). 

The effect of order of tasting was tested more critically by computing 
an analysis of covariance of degree of preference y upon the linear term 
for position in the sequence of pairs. The six successive paired tests 
were assigned covariate values of v = -5, -3, -1, 1, 3 and 5 respec- 
tively for computing sums of squares and products paralleling those of 
the rankits in Table 12. The regression of y upon v in the error (row 14) 
was not significant (F = 0.22) and, when F values were computed from 
the “reduced” mean squares, all of those in the preceding rows were 
substantially unchanged. Since the position of a pair had no demon- 
strable effect in either test upon the resulting preference, differences 
in sequence have been ignored in all later analyses. 


THE RANKIT ANALYSIS OF SIMPLE PREFERENCES 


The rankit analysis of simple preferences follows essentially the same 
model as Mosteller's [1951a] development of the Thurstone approach. 
Each stimulus, represented here by the apple sauce from one of the 
spray treatments, gives rise in the subject to a taste sensation, which 
can be located on a subjective scale. In replicated tests on the same 
subject, the sensations from each stimulus are assumed to be normally 
distributed. During the test, stimuli or samples are presented in pairs, 
each sample giving rise to a sensation. These sensations the subject 
compares and reports one as the greater, no ties being allowed. If the 
paired sensations are correlated, the correlation is assumed to be equal 
for all pairs. Just as the sensations in replicated tests on the same 
subject may be considered as varying normally, the same postulate 
of normality is adopted for the sensations in different subjects. 

Although the stimuli may have no measurable physical character- 
istic, they can be spaced on a linear scale of normal deviates by means 
of the responses. In the present experiment, each of the four treat- 
ments, representing the 2 X 2 factorial combinations of two insecticides 
and two fungicides, occurred in three pairs. The proportion of choices 
in which each was preferred to the other member of the same pair 
could be transformed to a normal deviate and the three deviates for 
each sample averaged, with 4 as the divisor. These averages or scores 
would measure the spacing on the hypothetical preference scale of the 
four separate treatments or stimuli. Mosteller [1951b] has described a 
χ² test for determining whether such a scale can be considered linear. 
One could compute an expected difference in normal deviates for each 
of the six paired comparisons, convert these differences to proportions 
and the proportions to equivalent angles. The corresponding observed 
proportion for each pair would then be converted to its equivalent 
angle, and the observed and expected angles compared by χ². 

In a replicated experiment, the above process can be simplified 
by means of rankits, tabled as "scores for ordinal (or ranked) data" 
by Fisher and Yates [1953]. Each rankit is the expected mean deviate 
of the 1st to Nth item in a ranked sample of N items from a normal 
population which has a mean of 0 and a standard deviation of 1. With 
rankits, the number of times a given treatment is preferred, when 
presented with every other treatment, can be converted directly to 
mean deviates. Non-additivity in the rankits may be tested by the F 
ratio from an analysis of variance, its error term being the interaction 
of non-additivity by replicates. We will assume here that the rankits 
for different stimuli or treatments differ so little, if at all, in their inherent 
precision, that negligible information is lost by weighting all values 
equally. 
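
The definition of a rankit given above can be turned into a direct calculation. The following is a minimal sketch, not the method used to prepare the Fisher and Yates table: it sums the density of the i-th order statistic of N standard normal deviates on a fine grid. The function names, the grid step and the integration limits are our own choices.

```python
import math

def rankit(i, n, step=0.001, limit=8.0):
    """Expected value of the i-th smallest of n standard normal deviates,
    approximated by summing the order-statistic density on a fine grid."""
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    total = 0.0
    for j in range(int(round(2 * limit / step)) + 1):
        x = -limit + j * step
        cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += x * coef * cdf ** (i - 1) * (1.0 - cdf) ** (n - i) * pdf * step
    return total

def rankit_for_frequency(freq, tasters=25):
    """Rankit of a preference count: freq = 0..tasters is item freq + 1
    in a ranked series of tasters + 1 items."""
    return rankit(freq + 1, tasters + 1)

largest_of_seven = rankit(7, 7)
thirteen_of_25 = rankit_for_frequency(13)
```

With 25 tasters a count of 13 preferences is the 14th item of a series of 26, so `rankit_for_frequency(13)` should fall close to the small positive deviate tabled for that position.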


Analysis with the replicate as the unit 


In the present experiment, the same 25 tasters judged all six pairs 
of samples on each of six test days. Consequently, in analyzing the data 
with the test day or replicate as the unit, differences between tasters 


TABLE 4 

FREQUENCY WITH WHICH APPLESAUCE FROM EACH SPRAY TREATMENT 
WAS PREFERRED IN EACH PAIRED COMPARISON BY 25 TASTERS 

In odd-numbered tests, the two samples of each pair were tasted in the order 
shown in Table 2 (order a); in the even-numbered tests this order was reversed 
(order b). The chi-square values for the consistency of preferences have been 
computed as described by Mosteller [1951b]. 

                         Frequency of preference for each treatment           Test of
Test   Date,   Order            in each paired comparison                    additivity
       1951           Th:LTh   Th:L    Th:(-)   LTh:L   LTh:(-)  L:(-)       χ²      P

 1     11/13     a    11 14    12 13   13 12    14 11   16  9    10 15      1.422   .71
 2     11/15     b    10 15    17  8   15 10    17  8   19  6    14 11      0.918   .86
 3     11/27     a    14 11    11 14   13 12    16  9   17  8    13 12      3.092   .38
 4     11/29     b     9 16    15 10   13 12    16  9   18  7    10 15      1.217    …
 5     12/4      a    11 14    15 10   14 11    16  9   13 12    12 13      0.527   .92
 6     12/6      b    14 11    11 14   13 12    16  9   20  5    19  6      6.243   .10

Total                 69 81    81 69   81 69    95 55  103 47    78 72        …

Treatments: LTh = lead arsenate and thiram, L = lead arsenate and sulfur, Th = parathion and 
thiram, (-) = parathion and sulfur. 
neither biased the ratings nor entered into the analysis. Table 4 
records the number of times that each apple sauce was preferred when 
presented in the pair listed at the top of the table. In tests 1, 3 and 5, 
the two samples forming each pair were tasted in the order shown in 
Table 2 (order a). Two days later, each subject repeated the test 
(Tests 2, 4 and 6) with the same sequence of pairs as in his preceding 
trial, but with the order of the two samples, within each pair, reversed 
(order b). The six replicates, therefore, formed three pairs of tests, 
which determined, in part, the structure of the analysis. 

As a first step, the frequencies of preference in Table 4 were trans- 
formed directly to rankits. With 25 tasters, the number preferring 
the first sample in a given pair could vary from 0 to 25, so that rankits 
were required for a series of 26. Since the actual number of such choices 
in the present experiment ranged only from 9 to 20, the corresponding 
rankits in Table 5 (left) are restricted to the 10th to the 21st item. As 


TABLE 5 

MEAN NORMAL DEVIATES (y) OR RANKITS FOR 
SERIES OF 26 AND OF SEVEN ITEMS 

(for transforming the data in Tables 4, 8 and 10) 

          For 26 alternatives                       For 7 alternatives
Frequency              Frequency              Frequency     Degree
of          Rankit,    of          Rankit,    of            of           Rankit,
preference    y        preference    y        preference    preference     y
    9        -.34         15         .24          0            -3         -1.35
   10        -.24         16         .34          1            -2          -.76
   11        -.14         17         .44          2            -1          -.35
   12        -.05         18         .55          3             0           0
   13         .05         19         .67          4             1           .35
   14         .14         20         .79          5             2           .76
                                                  6             3          1.35


these cover only the middle of the rankit scale, a direct analysis of the 
number of preferences, without transformation, would here have given 
substantially the same results. The second sample in each pair had 
numerically the same rankit as the first sample, but with the sign 
reversed, so that only the rankits for the first sample were required. 
The transformed values from Table 4 are given in the body of Table 6, 


each referring to the first of the two paired samples listed at the top 
of its column. 
TABLE 6 
RANKITS FOR THE REPLICATED TESTS IN TABLE 4 AND 
THEIR FACTORIAL COMBINATIONS 

Test  Date,   Order               Rankit y for each pairing                    Σxy for effect of
      1951   in pairs   Th:LTh   Th:L   Th:(-)  LTh:L  LTh:(-)  L:(-)        Th      L     Th X L
 1    11/13     a        -.14    -.05    .05     .14     .34    -.24         .48    .29      .47
 2    11/15     b        -.24     .44    .24     .44     .67     .14        1.79    .61      .30
 3    11/27     a         .14    -.14    .05     .34     .44     .05         .69    .49      .10
 4    11/29     b        -.34     .24    .05     .34     .55    -.24        1.18    .41      .87
 5    12/4      a        -.14     .24    .14     .34     .05    -.05         .77   -.10      .39
 6    12/6      b         .14    -.14    .05     .34     .79     .67        1.04   1.46     -.52

Total, T_i               -.58     .59    .58    1.94    2.84     .33        5.95   3.16     1.61
Σy_a - Σy_b, D_i          .30    -.49   -.10    -.30   -1.18    -.81       -2.07  -1.80      .31

Coefficients    Th         0       1      1       1       1       0
x for           L         -1      -1      0       0       1       1
effect of       Th X L    -1       0     -1       1       0      -1

The mean rankit or score for each of the four samples or treatments 
was determined from the sums ($T_i$) of the rankits for each pair in which 
it occurred, adding those from the columns in Table 6 where the treat- 
ment was listed first and subtracting those where it appeared second in 
the pair. The total for treatment Th, for example, was -.58 + .59 + 
.58 = .59, and for LTh, L and (-) it was 5.36, -2.20, and -3.75 
respectively. The following mean rankits from these totals, $\bar{y} = 
\sum y/(4 \times 6)$, representing the spacing of the four treatments in mean 
normal deviates, closely paralleled those obtained by applying the 

Treatment            Th       LTh       L        (-)
Mean rankit, ȳ      .0246    .2233    -.0917   -.1562
Mean deviate, S'    .0256    .2362    -.0967   -.1651

Thurstone-Mosteller technique separately to each replicate and averag- 
ing the values of S'. Their relative magnitudes corresponded almost 
exactly (Table 14). 
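
The rule for forming treatment totals from pair totals can be sketched as follows. The pair totals are those quoted for Table 6; the dictionary layout and the function name `treatment_totals` are illustrative choices of ours.

```python
# Pair totals T_i over the six tests (Table 6); each key is (first, second).
pair_totals = {
    ("Th", "LTh"): -0.58, ("Th", "L"): 0.59, ("Th", "(-)"): 0.58,
    ("LTh", "L"): 1.94, ("LTh", "(-)"): 2.84, ("L", "(-)"): 0.33,
}

def treatment_totals(totals):
    """Add a pair total where a treatment is listed first, subtract it where second."""
    out = {}
    for (first, second), t in totals.items():
        out[first] = out.get(first, 0.0) + t
        out[second] = out.get(second, 0.0) - t
    return out

totals = treatment_totals(pair_totals)
# Divisor 4 x 6: four treatments (as in the averaging of deviates) and six tests.
mean_rankits = {name: total / (4 * 6) for name, total in totals.items()}
print({name: round(m, 4) for name, m in mean_rankits.items()})
```

Because every pair total enters once with each sign, the four treatment totals, and hence the four mean rankits, sum to zero.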

The treatments in the present series formed a 2 X 2 factorial, 
as indicated by the symbols in Table 2, giving contrasts involving 
thiram v. sulfur, lead arsenate v. parathion, and the influence of each 
of these on the effect of the other, i.e., their interaction. These three 
contrasts have been designated as the effect of Th, of L, and of Th X L, 


390 BIOMETRICS, DECEMBER 1956 


respectively. To compute their numerical values directly, the pair 
totals T; were multiplied by the corresponding coefficients x at the 
bottom of Table 6 and the products summed. The totals of 5.95 for 


TABLE 7 
ANALYSIS OF VARIANCE OF THE REPLICATED TEST RANKITS IN TABLE 6 

                                                Sum of      Mean
Row   Source                           D.F.     squares     square       F

  1   Effect of Th                       1      1.4751      1.4751     33.16**
  2   Effect of L                        1       .4161       .4161      9.35*
  3   Effect of Th X L                   1       .1080       .1080      2.43
  4   Average non-additivity             3       .1607       .0536      1.20
  5   Order X Th                         1       .1785       .1785      4.01
  6   Order X L                          1       .1350       .1350      3.03
  7   Order X Th X L                     1       .0040       .0040       .09
  8   Order X non-additivity             3       .0956       .0319       .72
  9   Test X Th                          4       .0908       .0227       .53
 10   Test X L                           4       .2004       .0501      1.17
 11   Test X Th X L                      4       .2631       .0658      1.54
 12   Test X non-additivity             12       .5135       .0428      1.00
 13   Total                             36      3.6408
 14   Test interactions                 24      1.0678       .04449     1.00

*P < 0.01; **P < 0.001. 


the direct effect of Th, of 3.16 for the direct effect of L, and of 1.61 for 
the interaction Th X L represented the three degrees of freedom among 
the four treatments. 

The analysis of variance in Table 7 has been computed directly 
from the data in Table 6. The treatment effects in rows 1 to 3, computed 
as $(\sum xT_i)^2/4k$ with $4 = \sum x^2$ and $k = 6$ replicates, accounted for 
three of the six degrees of freedom among the six pair totals $T_i$. The 
other three degrees of freedom measured non-additivity in the series, 
or the failure of our observations to conform exactly to a linear order 
on the scale of rankits. Its sum of squares (row 4) was computed from 
the totals $T_i$ and the three preceding sums of squares as 
$(\sum T_i^2/k) - \sum\,[(\sum xT_i)^2/4k]$. 
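
The computation of rows 1 to 4 can be sketched in a few lines. The coefficients x are entered as we read them from the foot of Table 6 (0, 1, 1, 1, 1, 0 for Th; -1, -1, 0, 0, 1, 1 for L; -1, 0, -1, 1, 0, -1 for Th X L), which is an assumption where the printed row is unclear; the variable names are ours.

```python
pair_totals = [-0.58, 0.59, 0.58, 1.94, 2.84, 0.33]   # T_i, in the column order of Table 6
coefficients = {                                      # x, as read from Table 6
    "Th":   [0, 1, 1, 1, 1, 0],
    "L":    [-1, -1, 0, 0, 1, 1],
    "ThxL": [-1, 0, -1, 1, 0, -1],
}
k = 6  # replicates (tests)

def contrast(x, totals):
    """Sum of products of the coefficients with the pair totals."""
    return sum(xi * ti for xi, ti in zip(x, totals))

effects = {name: contrast(x, pair_totals) for name, x in coefficients.items()}
ss = {name: effects[name] ** 2 / (sum(xi * xi for xi in x) * k)
      for name, x in coefficients.items()}
# Non-additivity: variation among the six pair totals left after the three effects.
ss_nonadditivity = sum(t * t for t in pair_totals) / k - sum(ss.values())
print({n: round(e, 2) for n, e in effects.items()}, round(ss_nonadditivity, 4))
```

The three contrasts reproduce the totals 5.95, 3.16 and 1.61 quoted in the text, and the four sums of squares agree with rows 1 to 4 of Table 7 to four decimals.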

Because the two samples within each pair were always presented 
in the a series in the order shown in Table 2, and in the b series in the 
reverse order, we could measure the effect of reversing the order of 
tasting upon the tasters' preferences. The sum of the three rankits for 
order b was subtracted from the sum of the three rankits for order a to 
obtain the difference $D_i$ for each treatment pair (Table 6). Substituting 
$D_i$ for $T_i$ in the formulae of the preceding paragraph gave the sums of 
squares in rows 5 to 8 of Table 7. 

Although the remaining variation among the y's in Table 6 might
have been pooled at once into an experimental error, its homogeneity
could be tested by first isolating the “interaction”, i.e. discrepancy, of
each factorial effect by replicates, i.e. tests. Their respective sums
of squares in Table 7 were computed in turn from the last three
columns of $\Sigma xy$ in Table 6, as
$(\Sigma(\Sigma xy)^2/4) - (\Sigma^2(xT_i)/4k) - (\Sigma^2(xD_i)/4k)$.
The sums of squares in rows 1 to 11 were subtracted
from the total, $\Sigma y^2$, to obtain the remainder, that for the “interaction”
of tests and non-additivity. With its mean square (row 12) as a
provisional error, the tests clearly agreed in their rating of the three effects
of treatment; in consequence, all interactions of tests by treatment
were combined (row 14) into a single error mean square with 24 degrees
of freedom.
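The pooling step can be sketched in a few lines of Python. The degrees of freedom and sums of squares are those of rows 9 to 12 of Table 7; the dictionary names are ours, and the ratios are formed from the rounded table entries, so the last digit may differ slightly from a calculation carried to more places.

```python
# (degrees of freedom, sum of squares) for rows 9 to 12 of Table 7
test_interactions = [(4, 0.0908), (4, 0.2004), (4, 0.2631), (12, 0.5135)]

pooled_df = sum(df for df, _ in test_interactions)
pooled_ss = sum(ss for _, ss in test_interactions)
pooled_ms = pooled_ss / pooled_df          # row 14, the combined error

# Treatment mean squares from rows 1 to 3 (one degree of freedom each)
treatment_ms = {"Th": 1.4751, "L": 0.4161, "Th X L": 0.1080}
f_ratios = {name: ms / pooled_ms for name, ms in treatment_ms.items()}

print(pooled_df, round(pooled_ss, 4), round(pooled_ms, 5))
print({name: round(f, 2) for name, f in f_ratios.items()})
```

The pooled line reproduces row 14 (24 degrees of freedom, 1.0678, .04449), and the ratios correspond to the F column of rows 1 to 3.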

The variance ratios F showed a highly significant preference for 
thiram over sulfur and a smaller but still very significant preference for 
lead arsenate over parathion, with no significant interaction between 
the fungicide and the insecticide. Although the response to both direct 
effects may have been influenced in part by the order of presentation, 
neither interaction mean square was significant (P > 0.05). Finally, 
the F values in rows 4 and 8 gave no reason to question a linear relation 
among treatments on the rankit scale, as postulated in setting up the 
analysis. This agreed with the $\chi^2$ test for non-additivity on the scale
of normal deviates (Mosteller [1951b]) in the last two columns of Table 4.


Rankit analysis with the taster as the unit 


Since every subject appeared once in each of the six replicates, the 
data from the 25 subjects could be pooled and the agreement among 
replicates compared. To pool replicates and examine the agreement 
among subjects was equally legitimate. The reactions of each taster 
to each treatment pair were therefore totalled over all six replicates, 
giving for Th:LTh, Th:L etc. for taster 1, for example, frequencies of 
preference 3:3, 4:2, 4:2, 3:3, 6:0, and 1:5, leading over all 25 tasters 
to exactly the same treatment totals as in Table 4. The frequency of 
preference of each taster for the first of the two items in each paired 
comparison was then converted directly to its rankit in a series of seven 
(Table 5, right). The rankits for the above preferences of taster 1, for
example, were y = 0, .35, .35, 0, 1.35, -.76, respectively, as shown in
Table 8.

TABLE 8
RANKITS FOR INDIVIDUAL TASTERS

Rankits, from Table 5 for a series of seven, for the frequency of preferences by each
taster in specified comparisons.

Taster    Rankit y for the treatment listed first in each paired comparison
          Th:LTh    Th:L    Th:(-)    LTh:L    LTh:(-)    L:(-)

1 0 .35 .35 0 1.35 -.76
2 = fe Sail Sis 1230 0 16 .35 
3 = he 0 .39 —) 30 76 = he 
4 .76 76 0 1.35 0 0 
5 0 0 0 1.35 1.35 0 
6 76 0 0 0 == 665 0 
u 0 76 35 76 30 35 
8 35 = to 0 0 76 35 
9 0 76 .30 0 0 = hs 
10 — .35 0 0 = ot 0 0 
11 == 30h) 35 0 = alt 0 = 380) 
12 = is 0 39 76 1.35 = ai) 
13 0 35 0 othe 76 35 
14 0 = Bas 30 35 35 35 
15 eso 35 = 8) 0 76 == 8B) 
16 0 35 35 1.35 35 35 
17 = 68) =o = ot = 55 0 0 
18 = Wd) 35 = Bs 1.35 76 0 
19 35 35 5D 76 0 76 
20 OO eG 0 0 == ehh) 76 
21 0 .30 76 .76 1.35 76 
22 0 — .35 0 = ie = gui) 35 
23 0 0 35 0 76 ie 10) 
24 35 .30 0 35 0 1.35 
25 — .35 .35 1.35 76 76 =+ 148 
Total -2.16 1.86 2.16 8.50 11.48 1.23


The analysis of variance in Table 9 was computed from Table 8.
For each of the three factorial contrasts, the sums of the rankits ($T_i$)
for the 25 tasters were multiplied by the coefficients x at the bottom of
Table 6, and the contrast sum of squares computed as in Table 7 but
with k = 25. The residual variation among the $T_i$'s gave the average
non-additivity in row 4. To measure the agreement between subjects,
$\Sigma xy$ for each treatment factor, Th, L and Th X L, was determined
for each individual (as in the last three columns of Table 6), leading by
a similar calculation to the sums of squares in rows 5 to 7 of Table 9.
The remainder (row 8) measured the “interaction” of tasters and
non-additivity, which served as the error.
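Rows 1 to 4 of Table 9 can be retraced with the short Python sketch below, which rests on two assumptions. First, the contrast coefficients x for the six pairs are not reproduced in this section; those used here are half the difference between the factorial coefficients of the two treatments in each pair (as listed in Table 14). Second, the total for the pair Th:LTh is taken as -2.16, the value that reproduces the sums of squares printed in Table 9.

```python
# Pair order: Th:LTh, Th:L, Th:(-), LTh:L, LTh:(-), L:(-)
pair_totals = [-2.16, 1.86, 2.16, 8.50, 11.48, 1.23]  # first value is inferred
k = 25  # number of tasters

# Assumed contrast coefficients x for each pair (first minus second treatment).
coefficients = {
    "Th":     [0, 1, 1, 1, 1, 0],
    "L":      [-1, -1, 0, 0, 1, 1],
    "Th X L": [-1, 0, -1, 1, 0, -1],
}

def contrast_ss(x, totals, k):
    """(sum of x*T)^2 divided by (sum of x^2 times k)."""
    contrast = sum(xi * ti for xi, ti in zip(x, totals))
    return contrast ** 2 / (sum(xi * xi for xi in x) * k)

ss = {name: contrast_ss(x, pair_totals, k) for name, x in coefficients.items()}

# Non-additivity: variation among the six totals not explained by the contrasts.
ss["non-additivity"] = sum(t * t for t in pair_totals) / k - sum(ss.values())

print({name: round(value, 4) for name, value in ss.items()})
```

Under these assumptions the four values round to 5.7600, 1.6926, .5285 and .7526, as in Table 9.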


TABLE 9
ANALYSIS OF VARIANCE OF THE RANKITS FOR INDIVIDUAL TASTERS IN TABLE 8

                                          Sum of     Mean
Row  Source                      D.f.    squares    square        F

 1   Average Th                    1      5.7600    5.7600              10.64**
 2   Average L                     1      1.6926    1.6926    7.81*
 3   Average Th X L                1       .5285     .5285    2.44
 4   Average non-additivity        3       .7526     .2509    1.16
 5   Tasters X Th                 24     12.9956     .5415    2.49**     1.00
 6   Tasters X L                  24      6.6400     .2767    1.28
 7   Tasters X Th X L             24      4.5116     .1880     .87
 8   Tasters X non-additivity     72     15.6100     .2168    1.00
 9   Total                       150     48.4909

*P < 0.01; **P < 0.005.


From the F values in Table 9, it is concluded that the tasters reacted 
quite similarly to the comparison of lead arsenate with parathion, and 
even more similarly to the interaction of insecticide by fungicide 
(Th X L). Their responses to thiram vs. sulfur (Th), however, differed 
significantly. Over all tasters, neither non-additivity nor the inter- 
action Th X L was significant. Though smaller than in Table 7, the 
F value for lead arsenate was clearly significant. The average effect of 
thiram, when tested against the interaction of tasters by thiram, was 
also smaller but still significant. 

A common problem in taste tests is the internal consistency of the 
subjects. Individuals whose scores are not additive presumably are 
poor subjects, and greater sensitivity should result if they could be 
eliminated. For each subject, an adaptation of Mosteller's $\chi^2$ test
for non-additivity with 3 degrees of freedom has been computed from
the observed and expected angle for each of the six paired comparisons.
Of the 25 tasters, only two had a significant $\chi^2$ at P = .05, a
number which would be expected by chance. When totalled, $\Sigma\chi^2$ =
85.50 with 75 degrees of freedom (P = 0.2). So far as could be judged
from $\chi^2$, the tasters in this experiment did not differ significantly in the


394 BIOMETRICS, DECEMBER, 1956 


consistency of their preferences, but with only six replicates for each 
taster, the test may have been too small to identify discrepant indi- 
viduals. 
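The probability attached to the pooled $\chi^2$ can be approximated without tables. The sketch below uses the Wilson-Hilferty cube-root normal approximation, which is our substitution for a table lookup and not the method of the original analysis; the function name is hypothetical.

```python
import math

def chi2_upper_tail(chi2, df):
    """Approximate P(X >= chi2) by the Wilson-Hilferty normal approximation."""
    v = 2.0 / (9.0 * df)
    z = ((chi2 / df) ** (1.0 / 3.0) - (1.0 - v)) / math.sqrt(v)
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Pooled chi-square over the 25 tasters: 85.50 with 75 degrees of freedom.
p_pooled = chi2_upper_tail(85.50, 75)
print(round(p_pooled, 2))
```

The approximation gives a probability close to the P = 0.2 quoted above.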


A RANKIT ANALYSIS FOR DEGREE OF PREFERENCE 


Scheffé [1952] has extended the method of paired comparisons by 
asking the subject to report not only which sample he prefers in each 
pair, but also his degree of preference. This extension was included 
in the present experiment (Table 1). For analysis, Scheffé ranked each 
preference in his example from —3 to +3, but noted from the distri- 
bution of these ranks that scores would be preferred which reduced 
the difference between 0 and 1 and increased that between 2 and 3. 
In line with this suggestion, ranks here were transformed to the rankits 
for a series of seven (Table 5). The calculation of the analysis of vari- 
ance as given below has been simplified and extended from that reported 
by Scheffé. 
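For readers without Table 5 at hand, the rankits for a series of seven can be recovered numerically as the expected values of the order statistics in a sample of seven standard normal deviates. The sketch below is a plain trapezoidal integration; the step size and integration limits are arbitrary choices of ours, and the function name is hypothetical.

```python
import math

def rankit(rank, n, step=0.001, limit=8.0):
    """Expected value of the rank-th smallest of n standard normal deviates."""
    coef = math.factorial(n) / (math.factorial(rank - 1) * math.factorial(n - rank))
    steps = int(round(2 * limit / step))
    total = 0.0
    for i in range(steps + 1):
        x = -limit + i * step
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * math.erfc(-x / math.sqrt(2.0))
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end weights
        total += weight * x * pdf * cdf ** (rank - 1) * (1.0 - cdf) ** (n - rank)
    return coef * total * step

# Ranks -3 ... +3 correspond to order statistics 1 ... 7 in a series of seven.
rankits_of_seven = {score: round(rankit(score + 4, 7), 2) for score in range(-3, 4)}
print(rankits_of_seven)
```

To two decimals the scores are 0, .35, .76 and 1.35 in absolute value, so the step from 0 to 1 is smaller and the step from 2 to 3 larger than on the scale of simple ranks, in line with Scheffé's remark.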


Analysis of variance


The original data, of one rankit for each single-pair judgement, 
formed a block of 900 responses, representing the six pairs of samples, 
six replicate tests and 25 tasters. The frequency distribution of the 
rankit response for the 25 tasters in each row of Table 10 represents one 
replicate of each treatment pair. The dip in the total frequencies at 0 
preference is similar to that noted by Scheffé in his experiment, but the 
small frequencies at the ends of the scale suggest that seven choices 
were about right. The sums of squares in rows 1 to 9 of Table 12 were
computed from the $\Sigma fy$ in the last column of Table 10 and in the last
two rows of Table 11, by equations similar to those used for Table 7.
In each case, the divisor was the number of rankits y entering into the 
number that was squared, or 600 for each treatment effect and 150 for 
non-additivity. 

For computing the variation among tasters, the six responses y
of each taster to each pair of stimuli were totalled to obtain the $\Sigma y$ in
Table 11. The $\Sigma y$ in each row were then multiplied, in turn, by the
factorial coefficients x (Table 6) for each treatment effect, and from
these sums $\Sigma(x \Sigma y)$ their “interactions” with tasters in Table 12
were computed. The “interaction” of tasters and non-additivity was
determined as a difference, starting with the 150 values of $\Sigma y$ in
Table 11. The frequency distribution of the y in the last line of Table 10
gave the total sum of squares with 900 degrees of freedom and the error
sum of squares as a remainder (Table 12).

TABLE 11
TOTAL GRADED RESPONSE OF EACH TASTER IN RANKITS ($\Sigma y$)
FOR THE FIRST OF EACH PAIR OF TREATMENTS IN SIX TESTS
The error sums of squares [y²], each with 30 degrees of freedom, were computed
from the individual y for each taster.

          $\Sigma y$ for specified pair comparison
Taster                                                              [y²]
          Th:LTh    Th:L    Th:(-)    LTh:L    LTh:(-)    L:(-)    n = 30
1 il ul id teal a7) 1.05 = LA 4.37 
2 —3.04 —3.74 —3.03 = By 2.63 1.93 7.27 
3 —1.46 41 1.52 =D) 1.81 = SY 5.37 
4. 76 1.75 .06 1.40 06 0 3.37 
5 == oe —1.52 41 2.63 2.98 = fhe 6.23 
6 3.04 = ie ata) 35 —1-46 82 10.67 
a = dill! 1.05 1.46 1.40 1.05 76 4.78 
8 1.29 —2.63 S00 = 25 3.81 OG 16.60 
9 0 2.81 1.52 0 Sly alee 11.80 
10 = il) 0 0 = ihaill 0 0 4.58 
il == orf!) 29 Al == 0 ==) 10) 5.36 
12 aa levi) Ss yal .70 1.40 2.10 eel 4.22 
13 = 6%) 35 0 1.40 1.05 70 1.94 
14 06 — 229 .76 70 .70 i tat 5.51 
15 = off) 70 = itl) 0 1.40 = 3.76 
16 eG 2.05 2.05 2.16 76 ill 9.02 
17 = 3) = 3 ie ai) = 7 0 0 4.08 
18 =lOS 1.29 = 62 2.22 1.05 = she) 10.04 
19 1.52 1.87 = 5A 1.81 41 2.22 7.30 
20 —2.05 —2.40 = oi) 58 SED = oe 2.22 6.62 
21 ele hry 70 3.05 4.04 5.92 2.22 7.67 
22 0 = .35 a oat —1.40 =F AD 35 3.33 
23 ea elif ay a 70 1.41 4,22 OS 14.75 
24 ihe 1.93 0 .76 .35 3.33 ‘to 
25 — nS 2.17 3.16 2.98 3.22 06 7.96 
Sum | -11.22 5.44 9.61 19.17 30.95 6.66 | 174.35

Diff.* 3.74 —5.42 ari —7.59 =8.71 > = 10-3" 


*From the last column of Table 10, subtracting the $\Sigma fy$ for the even-numbered tests from those
for the odd-numbered tests; used in computing rows 5 to 8 of Table 12.

TABLE 12
ANALYSIS OF VARIANCE FOR DEGREE OF PREFERENCE IN RANKITS,
COMPUTED FROM THE DATA IN TABLES 10 AND 11
The F values in rows 1 and 2 have been determined with the mean squares in
rows 10 and 11 respectively. The values of F in the last two columns are from a
comparable analysis in ranks, without transformation to rankits.


Sum of | Mean F from 
Row Source D.f. | squares | square F ranks 
1 |Treatment Th 1 7.0785 |7.0785 OFIGss io) pte 
2 Ms L 1 3.1378 |3.1378 GRO7 Gale 
3 Th XL 1 Saye | Sever | al28) 1.65 
4 |Non-additivity 3 .2352 | .0784 .34 oa 
5 |Order X Th 1 .8118 | .8118 | 3.50 3.85 
6 SN Sel Fi 1 .5624 | .5624 | 2.42 2.28 
7 SD Yh ane 1 .0002 | .0002 .00 .00 
8 ae 
X non-additivity 3 .6626 | .2209 295 103 
9 |Other test 
interactions} 24 5.2201 | .2175 .94 .93 
10 /Tasters XK Th 2A) 1825464) 7728.| 3.335 1.00 BA ey) 
11 - Sols 24 | 12 4104+) 5171 1|.2.23° 1.00 2.16** 1.00 
12 4 X Thx Li) 24) 4.9044 | .2044 .88 .98 
13 i 
X non-additivity| 72 | 16.5759 | .2302 .99 -95 
14 |Remainder or error| 720 |167.1007 | .2321 | 1.00 1.00 
15 Total 900 |237 .5787 


*P < 0.025, pea Qe () [ °P < 0.005, ce lze<n Oh) 0) Ue 


The single Table 12 includes all of the comparisons previously 
divided between Tables 7 and 9, where the extent of the preference 
was not measured. Each F for non-additivity (rows 4, 8 and 13) was 
now less than 1. Those for the interaction between treatments, Th X L
(row 3) and for the effects of order of tasting within pairs (rows 5 to 8) 
decreased. Conversely, the “interaction” with tasters of each direct 
effect of treatment, Th and L, increased to a high degree of significance 
(rows 10 and 11). Both main effects in rows 1 and 2 were compared 
against these “interactions”, leading to somewhat smaller but still 
significant F values. Measuring the degree of preference has thus 
increased our information on certain key points. Relative to the 
variability among tasters, the direct effect for the fungicide (Th) and 
for the insecticide (L) were significant at the 1 percent and at the 2½
percent levels, respectively, so that the experiment was not needlessly
over-size.
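The effect of the choice of error term can be seen from the sketch below, which forms each main-effect F twice: once against its “interaction” with tasters, as in Table 12, and once against the remainder. The sums of squares and degrees of freedom are as read from Table 12; the dictionary names are ours.

```python
# (sum of squares, degrees of freedom) as read from Table 12
rows = {
    "Th": (7.0785, 1),
    "L": (3.1378, 1),
    "Tasters X Th": (18.5464, 24),
    "Tasters X L": (12.4104, 24),
    "error": (167.1007, 720),
}
mean_square = {name: ss / df for name, (ss, df) in rows.items()}

# Main effects tested against their own "interaction" with tasters.
f_vs_tasters = {
    "Th": mean_square["Th"] / mean_square["Tasters X Th"],
    "L": mean_square["L"] / mean_square["Tasters X L"],
}
# The same effects tested against the remainder, for comparison only.
f_vs_error = {name: mean_square[name] / mean_square["error"] for name in ("Th", "L")}

print({name: round(f, 2) for name, f in f_vs_tasters.items()})
print({name: round(f, 2) for name, f in f_vs_error.items()})
```

Tested against the taster interactions, the ratios are about 9.16 and 6.07, much smaller than the ratios against the remainder, which is why the conclusions are stated relative to the variability among tasters.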

Was anything gained by substituting rankits for the simple rank 
order, as measured by the digits —3 to 3? Rankits have the theoretical 
advantage of greater conformity with the model for the analysis of 
variance and are in line with the frequency distribution of degree of 
preference. For an empirical check, the analysis was recomputed with 
Scheffé’s original ranks, leading to the values for F in the last columns 
of Table 12. Both criteria gave substantially the same conclusions 
with few discrepancies. The effect of order upon the preference for 
thiram, for example, was apparently just significant (P = 0.05) in 
ranks, but not in rankits. Since the additional labor was not large, we 
would prefer a rankit analysis. 


Homogeneity of the variance among tasters 


In terms of simple preferences, the 25 tasters showed good internal 
consistency but with the introduction of degree of preference, the picture 
changed. The same variability appeared among individual tasters 
that has characterized other taste experiments in which samples are 
graded on a hypothetical scale. Among the 25 tasters participating in 
the present experiment, four found no difference in flavor which they
would designate as more than “slight”; nine found one or more
differences which they classed as “large”. For a more precise measure, an
“error” sum of squares, [y²], with 30 degrees of freedom was computed
for each taster from the variation in y within each paired comparison
(Table 11). The total of the 25 sums of squares was equal to the error
sum of squares in Table 12 (row 14) plus the nearly equal interactions
with order and test in rows 5 to 9. Bartlett's test indicated a highly
significant heterogeneity in the variance among individual tasters,
giving $\chi^2$ = 88.51 with 24 degrees of freedom, but since this test is
equally sensitive to non-normality (Box and Andersen, [1955]), both
may have been involved.

The distribution of log [y²] was relatively symmetrical, so that the
discrepancy among tasters could not be traced to one or a few aberrant
individuals. It was due more probably to discrepancies in scaling,
ascribable to differences either in taste acuity or in the interpretation
of the terms “slight”, “moderate”, and “large”. No transformation
of grades is likely to correct this type of variability among individual 
tasters. Any resulting heterogeneity in the error variance is not be- 
lieved to have invalidated the analysis in Table 12 in view of its con- 
sistency with the analyses in Tables 7 and 9, where scaling was not a 
factor, and of the known robustness of the analysis of variance as a
technique. We believe, instead, that reporting the degree of
preference increased the information provided by the experiment.


A COMPARISON OF ANALYTICAL METHODS 


As noted above, the rankit transformation, or mean deviate for 
the number of choices, is based upon the same assumed normal distri- 
bution of sensations or preferences within individuals or between 
individuals as the normal deviate postulated by Thurstone and Mosteller. 
The transformation of degree of preference to rankits is a simple exten- 
sion of Scheffé’s analysis in least squares. A method which differs 
much more in procedure is that described by Bradley and Terry (1952a,b, 
1953), in which the squared hyperbolic secant density, corresponding 
to a logistic curve (Hopkins, [1954]; Gridgeman [1955]), is substituted 
for the normal density in Thurstone’s model. They assume that the 
ranks assigned to any given pair of stimuli are distributed binomially, 
and have developed a rank order technique and tables (Bradley and
Terry [1952a]) for determining by $\chi^2$ the significance of differences
among the initial stimuli.

The present experiment has been analyzed by the Bradley-Terry 
technique (Table 13), both with the test or replicate as the unit, and 


TABLE 13
$\chi^2$ ANALYSIS OF SIMPLE PREFERENCES BY BRADLEY-TERRY TECHNIQUE

Source                                       D.f.    $\chi^2$       P

Overall comparison of treatments               3     32.550    .0005
Effect of order of presentation                3      5.099    .16
Other replicate X treatment interactions      12      9.182    .68
Agreement of panel members                    72    112.462    .005


with the taster as the unit. The four combinations of the two
insecticides and two fungicides in the present experiment gave a highly
significant $\chi^2$ among treatments and a significant discrepancy in the response
of panel members, but no effect either of the order of presentation of
the two samples within each pair or of other interactions of treatments 
with replicates. Recently, Abelson and Bradley [1954] have extended 
this method to the factorial analysis of a 2 X 2 design, so as to isolate 
the individual components now grouped in the first row of Table 13. 
Their calculation is so involved, however, that its availability is limited. 
It has not been applied to the present data. 

Kendall [1955] has proposed still a different approach, based upon
the matrix of “votes” cast for each competing treatment. Comparative
ratings or scores can be determined either from the initial matrix of 
votes, or from the square of this matrix, which presumably reallocates 
the scores more equitably. His procedure has been applied to the total 
preferences in Table 4, with ratings not unlike those given by other 
techniques. 

All methods permit the linear scaling of treatment preferences. 
Their scales in the present experiment have been compared in Table 14 


TABLE 14
COMPARATIVE SCALE LOCATIONS OF THE TREATMENT COMBINATIONS IN TABLE 2,
AND FACTORIAL EFFECTS COMPUTED WITH THE SPECIFIED FACTORIAL COEFFICIENTS

                                           Treatment combination        |    Factorial effect
Row  Scale                                 Th     LTh      L      (-)   |   Th       L     Th X L

 1   Mean normal deviate, Mosteller       .108   1.000   -.409   -.699  |  2.216   1.182    .602
 2   Rankits for replicates (Table 6)     .110   1.000   -.410   -.700  |  2.220   1.180    .600
 3   Rankits for tasters (Table 8)        .084   1.000   -.412   -.672  |  2.168   1.176    .656
 4   Degree of preference, ranks          .044   1.000   -.294   -.750  |  2.088   1.412    .500
 5   Degree of preference, rankits        .062   1.000   -.293   -.770  |  2.125   1.415    .461
 6   Ln p_i, deviates, Bradley-Terry      .039   1.000   -.399   -.640  |  2.078   1.202    .720
 7   Scores, Kendall matrix               .111   1.000   -.426   -.685  |  2.222   1.148    .630
 8   Scores, Kendall matrix squared       .181   1.000   -.441   -.740  |  2.362   1.118    .520

     Factorial coefficients   Th            1       1      -1      -1
     for effect of            L            -1       1       1      -1
                              Th X L       -1       1      -1       1


by dividing each score, expressed as a deviate from the mean score, by 
the score for the most preferred treatment (LTh). The Thurstone- 
Mosteller technique, applied to each replicate separately and then totalled 
over the six replicates, gave the ratings in row 1. Nearly the same results 
were obtained directly with rankits from the number of preferred 
choices in each replicate (row 2). With the data for each taster converted 
to the rankits for a series of 7 (instead of 26), the results diverged more 
from the other two (row 3). The scalings differed still more when the 
degree of preference was considered, whether expressed in the simple 
ranks described by Scheffé (row 4) or in rankits (row 5). The logarith- 
mic scale from the Bradley-Terry approach, based upon the total 
number of choices for each treatment, and scales based upon the scores 
from Kendall's matrix have been added for comparison (rows 6-8).
The results for the four treatment combinations have been combined
with the factorial coefficients in the lower part of Table 14 to obtain the 
relative measures of the effect of thiram, of lead arsenate, and of their 
interaction in the last three columns. The interaction was largest on 
the Bradley-Terry scale, which also gave the smallest effect for thiram 
and an intermediate value for lead arsenate. In this experiment all 
scalings agreed sufficiently, however, that convenience and compre- 
hensiveness would rank high in selecting a procedure. 
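The factorial effects in the last three columns of Table 14 follow directly from the scale locations and the factorial coefficients, as the sketch below illustrates for two of the scales. The column order Th, LTh, L, (-) is our reading of the table heading; it is the order under which the printed effects are reproduced. All names in the code are illustrative.

```python
# Factorial coefficients for the columns Th, LTh, L, (-) of Table 14
coefficients = {
    "Th":     (1, 1, -1, -1),
    "L":      (-1, 1, 1, -1),
    "Th X L": (-1, 1, -1, 1),
}

# Scale locations from rows 1 and 6 of Table 14
scales = {
    "Mean normal deviate, Mosteller": (0.108, 1.000, -0.409, -0.699),
    "Ln p, Bradley-Terry": (0.039, 1.000, -0.399, -0.640),
}

def factorial_effects(locations):
    """Sum of coefficient times scale location for each factorial effect."""
    return {
        effect: round(sum(c * s for c, s in zip(coefs, locations)), 3)
        for effect, coefs in coefficients.items()
    }

effects = {scale: factorial_effects(loc) for scale, loc in scales.items()}
for scale, values in effects.items():
    print(scale, values)
```

The first scale gives 2.216, 1.182 and .602, and the Bradley-Terry scale 2.078, 1.202 and .720, as printed in the table.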

The techniques seemed to compare more or less as follows. If 
a paired comparison is based upon a simple preference and is replicated, 
analysis with rankits is simpler and more flexible than with normal 
deviates. But if the data represent a single test with many subjects, 
Mosteller's [1951b] $\chi^2$ test for additivity may be necessary. There
seems to be good reason for balancing an experiment so that it can 
be analyzed both with the replicate as the unit and separately with 
the subject as the unit. Additional information is gained by recording 
the degree of preference, despite heterogeneity among tasters in their 
error variances, and here we would prefer the calculation in rankits. 
For practicality, the Bradley-Terry analysis depends upon special 
tables, which they have published, but it seems to allow fewer compari- 
sons than the other techniques. We have not studied the Kendall 
analysis sufficiently to warrant an opinion. 


SUMMARY 


Paired comparisons, in which each treatment is tested in pairs 
with every other treatment, are well adapted to comparative studies 
of the effects of chemically different pesticides upon the flavor of a food. 
The palatability of sauce made from apples, sprayed in a 2 X 2 factorial 
experiment with two different insecticides and two different fungicides, 
has been evaluated by this design. Six pairs of samples, representing 
all possible treatment combinations, were tasted on each of six test 
days by the same 25 subjects, each subject recording both his choice 
within the pair and the degree of his preference as suggested by Scheffé. 

For analysis, the deviate of the normal curve in the Thurstone- 
Mosteller model was replaced by the rankit, defined as the mean deviate 
for the ranked items in a sample from a normal population with mean 
zero and unit standard deviation. Two analyses were computed from 
the number of choices for one of the treatments in each pair: (1) with 
the replicate as the unit and the rankits for a series of 26 as the variate, 
and (2) with the taster as the unit and the rankits for a series of 7 as 
the variate. The degree of preference reported for each test pair was 
also converted to rankits for a third analysis.

The three analyses of variance complemented one another. In
every case, there was satisfactory additivity of treatment preferences 
on the rankit scale; the order of tasting the two samples comprising 
each pair had no significant effect upon preference; and the results on 
the six replicate or test days were mutually consistent. Sauce prepared 
from apples sprayed with lead arsenate was preferred to that sprayed 
with parathion, and, even more markedly, thiram was preferred to 
sulfur. The flavor associated with the insecticide did not modify that 
due to the fungicide, the interaction not being significant in any test. 
However, individual tasters differed in their flavor preferences for the 
direct effects of treatment, the “interaction” of tasters and insecticide
being significant when the degree of preference was reported, and the
“interaction” of tasters and fungicide highly significant by both
procedures. Recording the degree of preference gave somewhat more
information than the simple choice between the two samples of each 
pair, despite a significant heterogeneity in the variance within tasters. 

For comparison, the same data were analyzed in varying degree 
by four alternative methods: the Thurstone-Mosteller, Bradley-Terry, 
Kendall, and Scheffé techniques. The results have been compared in 
terms of the relative scale positions of the four treatments and of their 
factorial combinations. Despite differences in the underlying model 
and method of analysis, the treatment rankings on a preference scale 
were substantially the same. A choice among them in this case could 
be determined by convenience and the ability of the technique to extract 
the required information with the least computational effort. Rankit
analysis seemed to meet this need effectively.

OF SELF FERTILIZED POPULATIONS*
AND
CHARLES R. WEBER
1. INTRODUCTION


Plant breeders frequently classify the material of a generation into 
groups according to the ancestry of the group. Replication of the groups 
permits the variance component due to that type of grouping to be 
estimated. Groups of one generation may be related to groups of 
another generation through common ancestry and hence covariances 
can be estimated. Such estimated quantities are functions of the 
underlying genetic mechanism and in particular of genotypic variances 
and covariances. They thus provide a basis for inference about the 
genetic mechanism. 

Genotypic variances and covariances have been defined and derived 
under various assumptions by numerous authors; notably Fisher [1932], 
Mather [1949], Horner [1955], Gates [1954] and Kempthorne [1956]. 
The definitions have been based on infinite populations. These may be 
divided into sub-populations whose genotypic composition and relative 
frequency are known from genetic theory. The actual magnitude of a 
genotypic variance or covariance depends on the value assigned to each 
genotype. This depends on the physiology of the expression of the
genotype and is usually a function, simple or complicated, of the genes 
of the genotype. 

The relationship of sample covariances (or sample variance compo- 
nents) to genotypic variances and covariances is not always obvious. 
The expected values of the sample covariances may be a function of 
more than one genotypic covariance. The relationship is often further 
obscured by the fact that actual plant populations are finite and a 
defined infinite subpopulation may be represented by only a few plants. 
The problem of relationship becomes particularly acute in populations 
which reproduce by self fertilization. These give rise to populations
having a hierarchical structure and hence to a whole set of genotypic
variances and covariances. 

The objective in the present paper is to describe a formal procedure 
for obtaining the relationship of sample covariances and variance com- 
ponents to genotypic variances and covariances. This procedure is 
described with reference to an experiment conducted at the Iowa
Agricultural Experiment Station on a soybean cross using data from
the $F_2$ through the $F_7$ generation.


2. DESCRIPTION OF SOYBEAN EXPERIMENT 


Homozygous soybean varieties Adams and Hawkeye were crossed in
1947. Their $F_1$ was produced in 1948, their $F_2$ in 1949, etc. For the
purposes of this study, soybeans are nearly completely self fertilized.
The experiment has the structure indicated in Table 1. 


TABLE 1
STRUCTURE OF THE DATA

Year    Population    Structure

1949    $F_2$         A_1   ...   A_i   ...   A_94
1950    $F_3$         B_1   ...   B_i   ...   B_94
1951    $F_4$         C_1 C_1*   ...   C_i C_i*   ...   C_94 C_94*
1952    $F_5$         D_1 D_1*   ...   D_i D_i*   ...   D_94 D_94*
1953    $F_6$         E_1 E_1*   ...   E_i E_i*   ...   E_94 E_94*
1954    $F_7$         F_1 F_1*   ...   F_i F_i*   ...   F_94 F_94*

Unselected progeny: *. Selected progeny: no star


Ninety four random $F_2$ plants, indicated by A's in Table 1, were
selected. The 94 progenies in the $F_3$ generation, indicated by B's, were
grown in a simple lattice design together with parents and bulk
populations. Two plants were selected at random from each $F_3$ progeny.
The 188 resulting $F_4$ progenies, which are indicated by C's in Table 1,
were again grown in a simple lattice design. To obtain the $F_5$ generation,
one of the two progenies in the $F_4$ generation tracing to each particular
$F_2$ individual was selected at random. Two random plants of each
selected progeny produced $F_5$ progenies. The cycle was then repeated.
Selected progenies are indicated in Table 1 by the absence of a star.
The letters B through F, in addition to being used to represent progenies,
will also be used to represent adjusted phenotypic progeny means, and
the letter A will also be used to represent a measured value of an $F_2$
plant.


TABLE 2
SAMPLE COVARIANCES AND VARIANCE COMPONENTS AND THEIR EXPECTED VALUES

Sample covariance        Expected value

A  with B                Cov (2; 2, 3)
A  with ½(C + C*)        Cov (2; 2, 4)
A  with ½(D + D*)        Cov (2; 2, 5)
A  with ½(E + E*)        Cov (2; 2, 6)
A  with ½(F + F*)        Cov (2; 2, 7)
B  with ½(C + C*)        Cov (2; 3, 4) + a Cov (3; 3, 4) - a Cov (2; 3, 4)††
B  with ½(D + D*)        Cov (2; 3, 5) + a Cov (3; 3, 5) - a Cov (2; 3, 5)
B  with ½(E + E*)        Cov (2; 3, 6) + a Cov (3; 3, 6) - a Cov (2; 3, 6)
B  with ½(F + F*)        Cov (2; 3, 7) + a Cov (3; 3, 7) - a Cov (2; 3, 7)
C  with ½(D + D*)        Cov (3; 4, 5) + b Cov (4; 4, 5) - b Cov (3; 4, 5)
C  with ½(E + E*)        Cov (3; 4, 6) + b Cov (4; 4, 6) - b Cov (3; 4, 6)
C  with ½(F + F*)        Cov (3; 4, 7) + b Cov (4; 4, 7) - b Cov (3; 4, 7)
D  with ½(E + E*)        Cov (4; 5, 6) + c Cov (5; 5, 6) - c Cov (4; 5, 6)
D  with ½(F + F*)        Cov (4; 5, 7) + c Cov (5; 5, 7) - c Cov (4; 5, 7)
E  with ½(F + F*)        Cov (5; 6, 7) + d Cov (6; 6, 7) - d Cov (5; 6, 7)
C* with ½(D + D*)        Cov (2; 4, 5)
C* with ½(E + E*)        Cov (2; 4, 6)
C* with ½(F + F*)        Cov (2; 4, 7)
D* with ½(E + E*)        Cov (3; 5, 6)
D* with ½(F + F*)        Cov (3; 5, 7)
E* with ½(F + F*)        Cov (4; 6, 7)

                        Sample
Year    Population      variance component      Expected value

1950    $F_3$           Among A groups          Cov (2; 3, 3)†
1951    $F_4$           Among B groups          Cov (2; 4, 4)
1951    $F_4$           Within B groups         Cov (3; 4, 4) - Cov (2; 4, 4)†
1952    $F_5$           Among C groups          Cov (3; 5, 5)
1952    $F_5$           Within C groups         Cov (4; 5, 5) - Cov (3; 5, 5)†
1953    $F_6$           Among D groups          Cov (4; 6, 6)
1953    $F_6$           Within D groups         Cov (5; 6, 6) - Cov (4; 6, 6)†
1954    $F_7$           Among E groups          Cov (5; 7, 7)
1954    $F_7$           Within E groups         Cov (6; 7, 7) - Cov (5; 7, 7)†

†Strictly speaking, year X genotype interaction components should also be included in these
expected values.
††a, b, c and d are the reciprocals of the harmonic means of the numbers of plants in B, C, D
and E respectively. The harmonic means are around 100 and hence the terms involving these
coefficients are negligible.

In this experiment there were 21 sample covariances and 9 variance
components which were unconfounded with environmental effects and
which could be given a genetic interpretation. These are listed in
Table 2 along with their expected values.


3. INTERPRETATION OF 
GENOTYPIC VARIANCES AND COVARIANCES 
The meaning of the symbolization Cov (2; 2, or in general 
Cov (k; n, n') is indicated by Figure 1. Cov (k; n, n') is the genotypic
covariance of progenies in the nth generation from particular genotypes
in the kth generation with progenies in the n’th generation from those 
same genotypes of the kth generation. 


FIGURE 1
INTERPRETATION OF Cov (k; n, n')

[Diagram: genotypes in generation k and their progenies in generations n and n'.]

M and M' are genotypic progeny means of progenies in the n and n' generations
which are descended from genotypes in the kth generation.

Cov (k; n, n') = Cov (M, M')


All genotypic variances and covariances of populations having the
hierarchical structure described can be expressed as linear functions of
the Cov (k; n, n'). With this notation, the genotypic variance among
$F_2$ plants is Cov (2; 2, 2), and the covariance of $F_2$ plants with their
progenies in the $F_3$ is Cov (2; 2, 3). The variance of $F_3$ progeny means
is Cov (2; 3, 3) and the average variance within these progenies is
[Cov (3; 3, 3) - Cov (2; 3, 3)].

The interpretation of the Cov (k; n, n′) is a logical consequence of
the genetic assumptions. As an example of interpretation, consider
Cov (3; 4, 5); that is, the genotypic covariance of progenies in the F₄
and F₅ generations which trace to particular genotypes in the F₃ genera-
tion. Letting $p_t$ represent the frequency of the tth genotype in the F₃
and letting $y_{t.}$ and $y_{t..}$ represent the respective F₄ and F₅ progeny
means, then

$$\mathrm{Cov}(3; 4, 5) = \sum_t p_t\,(y_{t.} - y_{..})(y_{t..} - y_{...})$$

where $y_{..}$ and $y_{...}$ are the genotypic means of the F₄ and F₅ generations. In this
notation the generation with which a symbol is associated is indicated
by the number of subscripts of the symbol. Each successive generation
has one more subscript. Taking the case of a single locus with two
alleles, let the values of the three genotypes be z + 2u, z + u + au
and z. From genetic theory it is known that the frequencies of the three
genotypes in the F₃ are 3/8, 2/8 and 3/8. The genotypic progeny means
in the F₄ generation are z + 2u, z + u + (1/2)au and z; and in the F₅
generation, z + 2u, z + u + (1/4)au and z. Thus,

$$\mathrm{Cov}(3; 4, 5) = \tfrac{3}{4}u^2 + \tfrac{3}{128}a^2u^2 = \tfrac{3}{2}\sigma_A^2 + \tfrac{3}{32}\sigma_D^2 ,$$

where $\sigma_A^2$ and $\sigma_D^2$ are the
additive genetic and dominance variances of the F₂ generation.
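
The single-locus example can be checked by direct enumeration. The sketch below is an illustration only, not part of the original derivation: the function name and the numerical values of z, u and a are hypothetical, and it assumes continued selfing from an F₁ heterozygote, so that the heterozygote frequency in the kth generation is (1/2)^(k−1) and the dominance deviation of a progeny mean is halved with each further generation of selfing.

```python
from fractions import Fraction


def cov_single_locus(k, n, n_prime, z, u, a):
    """Cov(k; n, n') for one locus with two alleles under continued selfing.

    Genotypic values are z + 2u, z + u + a*u and z (hypothetical helper,
    following the parameterisation used in the text).
    """
    het = Fraction(1, 2) ** (k - 1)      # heterozygote frequency in generation k
    hom = (1 - het) / 2                  # frequency of each homozygote
    freqs = [hom, het, hom]

    def progeny_means(gen):
        # Mean in generation `gen` of the progeny tracing to each genotype
        # of generation k; only the heterozygote's progeny mean changes.
        shrink = Fraction(1, 2) ** (gen - k)
        return [z + 2 * u, z + u + shrink * a * u, z]

    m1, m2 = progeny_means(n), progeny_means(n_prime)
    mean1 = sum(p * m for p, m in zip(freqs, m1))
    mean2 = sum(p * m for p, m in zip(freqs, m2))
    return sum(p * (x - mean1) * (y - mean2) for p, x, y in zip(freqs, m1, m2))


# Arbitrary illustrative values; any exact rationals would do.
z, u, a = Fraction(10), Fraction(3), Fraction(2)
direct = cov_single_locus(3, 4, 5, z, u, a)
closed_form = Fraction(3, 4) * u**2 + Fraction(3, 128) * a**2 * u**2
sigma_a2, sigma_d2 = u**2 / 2, (a * u) ** 2 / 4   # F2 additive and dominance variances
assert direct == closed_form == Fraction(3, 2) * sigma_a2 + Fraction(3, 32) * sigma_d2
```

Exact rational arithmetic is used so that the equality with the closed form can be asserted without rounding tolerance.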


4. DERIVATION OF THE COVARIANCE OF C WITH ½(D + D*)


4.1 Notation and model for an adjusted progeny mean: 


The covariance of C with ½(D + D*) is derived in this section to
illustrate the derivation of covariances and variance components. For
this purpose $C_i$ will be represented by

$$\bar{x}_{i.} = \frac{\sum_j x_{ij}}{n_i}$$

where $x_{ij}$ is the value of the jth plant in $C_i$ and $n_i$ is the number of
plants in $C_i$. For $D_i$ we will use

$$\bar{x}_{ij'.} = \frac{\sum_k x_{ij'k}}{n_{ij'}}$$

where $x_{ij'k}$ is the value of the kth plant in $D_i$. The prime on the sub-
script j denotes the fact that the $D_i$ progeny is from a particular plant
in $C_i$, the j′ plant say. The j′ plant is one of the two plants which were
selected at random in $C_i$. The other will be indicated by j″. For
$D_i^*$ we will use

$$\bar{x}_{ij''.} = \frac{\sum_k x_{ij''k}}{n_{ij''}} .$$
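The expected value of the sample covariance is

$$E\{\widehat{\mathrm{Cov}}[C, \tfrac{1}{2}(D + D^*)]\}
= E\left\{\frac{\sum_i (\bar{x}_{i.} - \bar{x}_{..})\,\tfrac{1}{2}\left[(\bar{x}_{ij'.} + \bar{x}_{ij''.}) - (\bar{x}_{.j'.} + \bar{x}_{.j''.})\right]}{n - 1}\right\}$$

where a dot in place of i denotes the mean over the n progenies. This
can be evaluated most easily by making use of the equivalent
expression

$$E\{\widehat{\mathrm{Cov}}[C, \tfrac{1}{2}(D + D^*)]\}
= \frac{1}{2n(n - 1)} \sum_i \sum_{i' \ne i} E\left\{(\bar{x}_{i.} - \bar{x}_{i'.})\left[\tfrac{1}{2}(\bar{x}_{ij'.} + \bar{x}_{ij''.} - \bar{x}_{i'j'.} - \bar{x}_{i'j''.})\right]\right\}
= \frac{1}{2n(n - 1)} \sum_i \sum_{i' \ne i} E\{PP'\}, \text{ say,}$$

which thus reduces the problem to that of finding the expected covariance
associated with individual degrees of freedom.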

The model for $\bar{x}_{i.}$ is $\bar{x}_{i.} = \mu_1 + \bar{z}_{i.} + e_i$ where $\mu_1$ and $e_i$ have the
usual meanings and $\bar{z}_{i.}$ is the mean of the genotypic effects in the $C_i$
progeny, assumed to be random. The model for $\bar{x}_{ij'.}$ is similar except
that $\mu_2$ replaces $\mu_1$ and $\bar{z}_{ij'.}$ replaces $\bar{z}_{i.}$. Strictly speaking the models
should have an additional term for year × genotype interaction. For
present purposes, this term is not considered. Only certain of the
sample variance components would have their expectations changed
under the more complex model.


4.2 Genotypic covariance of observed progeny means: 


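Assuming that the environmental effects of one year are uncorrelated
with those of another year, then $E\{PP'\}$ reduces to

$$E\{PP'\} = \tfrac{1}{2}\{E(\bar{z}_{i.}\bar{z}_{ij'.}) + E(\bar{z}_{i.}\bar{z}_{ij''.}) + E(\bar{z}_{i'.}\bar{z}_{i'j'.}) + E(\bar{z}_{i'.}\bar{z}_{i'j''.})
- E(\bar{z}_{i.}\bar{z}_{i'j'.}) - E(\bar{z}_{i.}\bar{z}_{i'j''.}) - E(\bar{z}_{i'.}\bar{z}_{ij'.}) - E(\bar{z}_{i'.}\bar{z}_{ij''.})\}.$$

By the procedure to be described subsequently it can be shown that the
four terms entering with a plus sign are equal while the other four are
zero and thus $E\{PP'\} = 2E(\bar{z}_{i.}\bar{z}_{ij'.})$. This genotypic covariance of ob-
served progeny means can be written as

$$E(\bar{z}_{i.}\bar{z}_{ij'.}) = E\left[\frac{\left(\sum_j z_{ij}\right)\left(\sum_k z_{ij'k}\right)}{n_i\, n_{ij'}}\right]$$

which reduces to

$$\frac{n_i - 1}{n_i}\,E(z_{ij} z_{ij'k}) + \frac{1}{n_i}\,E(z_{ij'} z_{ij'k}), \qquad j \ne j'.$$

$E(z_{ij} z_{ij'k})$ is the genotypic covariance of a plant in $C_i$ with a plant in the
progeny of a sib. $E(z_{ij'} z_{ij'k})$ is the genotypic covariance of a plant in $C_i$
with a plant in its own progeny.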
4.3 Model for genotypic effect of an observed plant:


The genotypic covariances of observed plants, $E(z_{ij} z_{ij'k})$ and
$E(z_{ij'} z_{ij'k})$, can be related to population genotypic covariances. To do
this let $\delta_{ij}^{tu}$ be a random variable that assumes the value one when the
genotypic effect of the jth plant in $C_i$ corresponds to the genotypic
effect of the uth genotype in the progeny of the tth genotype in the F₃
generation and zero otherwise. Likewise let $\delta_{ijk}^{tuv}$ be a random variable
which assumes the value one if the kth plant in the progeny from the
jth plant of $C_i$ corresponds to the genotypic effect of the
vth genotype in the progeny from the uth genotype in the progeny from
the tth genotype of the F₃ generation and zero otherwise.

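The properties of the δ's are known from genetic theory. For example
$E(\delta_{ij}^{tu}) = p_t p_{tu}$ where $p_t$ is the probability of the tth genotype in the
F₃ generation and $p_{tu}$ is the probability of the uth genotype in the pro-
geny of the tth genotype given the tth genotype. Similarly $E(\delta_{ijk}^{tuv}) = p_t p_{tu} p_{tuv}$ and

$$E(\delta_{ij}^{tu}\,\delta_{i'j'k}^{t'u'v}) =
\begin{cases}
p_t\,p_{tu}\,p_{t'}\,p_{t'u'}\,p_{t'u'v} & \text{for } i' \ne i \\
p_t\,p_{tu}\,p_{tu'}\,p_{tu'v} & \text{for } i' = i,\ t' = t \text{ and } j' \ne j \\
p_t\,p_{tu}\,p_{tuv} & \text{for } i' = i,\ t' = t,\ j' = j \text{ and } u' = u \\
0 & \text{otherwise.}
\end{cases}$$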


The genotypic effects of observed plants are now related to popula-
tion genotypic effects by the equations

$$z_{ij} = \sum_t \sum_u \delta_{ij}^{tu}\,(y_{tu} - y_{..})$$

$$z_{ijk} = \sum_t \sum_u \sum_v \delta_{ijk}^{tuv}\,(y_{tuv} - y_{...})$$

where $y_{tu}$ is the genotypic value of the uth genotype in the progeny of
the tth genotype, $y_{..}$ is the genotypic mean of the F₄ generation and
$y_{tuv}$ and $y_{...}$ have similar meanings. It follows that

$$E(z_{ij} z_{ij'k}) = E\left\{\sum_t \sum_u \delta_{ij}^{tu}\,(y_{tu} - y_{..}) \sum_{t'} \sum_{u'} \sum_v \delta_{ij'k}^{t'u'v}\,(y_{t'u'v} - y_{...})\right\}.$$

After simplifying the right hand side by making use of the properties
of the δ's, we have

$$E(z_{ij} z_{ij'k}) = \sum_t p_t\,(y_{t.} - y_{..})(y_{t..} - y_{...}) = \mathrm{Cov}(3; 4, 5).$$

In a similar manner it can be shown that

$$E(z_{ij'} z_{ij'k}) = \mathrm{Cov}(4; 4, 5).$$
4.4 Expected value of Côv [C, ½(D + D*)]:


This expected value is then

$$\frac{1}{2n(n - 1)} \sum_i \sum_{i' \ne i} 2\left\{\left(\frac{n_i - 1}{n_i}\right)\mathrm{Cov}(3; 4, 5) + \frac{1}{n_i}\,\mathrm{Cov}(4; 4, 5)\right\}.$$

However, Cov (4; 4, 5) represents the total genotypic covariance of the
F₄ and F₅ generations. Hence Cov (4; 4, 5) contains Cov (3; 4, 5) plus
an additional covariance that might be represented as Cov (4; 4, 5) −
Cov (3; 4, 5). Making this substitution and simplifying, we have

$$E\{\widehat{\mathrm{Cov}}[C, \tfrac{1}{2}(D + D^*)]\} = \mathrm{Cov}(3; 4, 5) + b\,[\mathrm{Cov}(4; 4, 5) - \mathrm{Cov}(3; 4, 5)]$$

where b is the reciprocal of the harmonic mean of the $n_i$, which is the
number of plants in $C_i$.
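
The role of b can be seen in a short numerical sketch. The function names, progeny sizes and covariance values below are hypothetical, chosen only to illustrate the algebra: averaging the per-progeny weights (n_i − 1)/n_i and 1/n_i over the progenies gives the same result as the compact form with b equal to the reciprocal of the harmonic mean of the n_i.

```python
def reciprocal_harmonic_mean(sizes):
    """b = (1/n) * sum(1/n_i), the reciprocal of the harmonic mean of the n_i."""
    return sum(1.0 / n for n in sizes) / len(sizes)


def expected_cov_compact(cov_345, cov_445, sizes):
    """Cov(3;4,5) + b * [Cov(4;4,5) - Cov(3;4,5)]."""
    b = reciprocal_harmonic_mean(sizes)
    return cov_345 + b * (cov_445 - cov_345)


def expected_cov_termwise(cov_345, cov_445, sizes):
    """Average over progenies of ((n_i - 1)/n_i) Cov(3;4,5) + (1/n_i) Cov(4;4,5)."""
    terms = [((n - 1) / n) * cov_345 + (1 / n) * cov_445 for n in sizes]
    return sum(terms) / len(terms)


# Hypothetical progeny sizes near 100 and hypothetical covariances.
sizes = [90, 100, 110]
compact = expected_cov_compact(16.0, 19.0, sizes)
termwise = expected_cov_termwise(16.0, 19.0, sizes)
assert abs(compact - termwise) < 1e-12
```

With progeny sizes near 100, b is close to 0.01, so the second term contributes only about one per cent of the difference between the two covariances.
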
5. NUMERICAL EXAMPLE 


The derivations above have shown that the expected values of the 
sample covariances are linear functions of the Cov (k; n, n’); it remains 
to be indicated how actual genotypic models are fitted to the data. 
Accordingly, let Y represent a sample covariance. Then 


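$$Y = \mathrm{Cov}(k; n, n') + \epsilon$$

where $\epsilon$ is an error.
For instance, if an additive model with dominance and additive ×
additive interaction, two alleles per locus, no linkage and a gene fre-
quency of one-half is assumed, then it can be shown that

$$\mathrm{Cov}(k; n, n') = \left(\frac{2^{k-1} - 1}{2^{k-2}}\right)\sigma_A^2 + \left(\frac{2^{k-1} - 1}{2^{n+n'-4}}\right)\sigma_D^2 + \left(\frac{2^{k-1} - 1}{2^{k-2}}\right)^2 \sigma_{AA}^2 .$$

If one makes other assumptions then Cov (k; n, n′) will be different.
Assuming the above model, however, and defining

$$\beta_1 = \sigma_A^2, \quad \beta_2 = \sigma_D^2, \quad \beta_3 = \sigma_{AA}^2,$$

$$X_1 = \frac{2^{k-1} - 1}{2^{k-2}}, \quad X_2 = \frac{2^{k-1} - 1}{2^{n+n'-4}} \quad \text{and} \quad X_3 = \left[\frac{2^{k-1} - 1}{2^{k-2}}\right]^2 ,$$

the model for the sample covariance becomes

$$Y = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \epsilon$$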


and least squares procedures can be utilized to give estimates of $\beta_1$,
$\beta_2$ and $\beta_3$. Values of the X's for the thirty sample covariances are
shown in Table 3 along with the sample covariances for the character
maturity. This character, maturity, was the number of days after
August 31 that 95 to 100% of the pods in a plot were ripe as measured
by eye.
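
The regressors can be generated mechanically from (k; n, n′). The sketch below is illustrative rather than the authors' procedure: it assumes the coefficient forms X₁ = (2^(k−1) − 1)/2^(k−2), X₂ = (2^(k−1) − 1)/2^(n+n′−4) and X₃ = X₁², which reproduce the legible X entries of Table 3 (for example 1.50 and 2.25 for Cov (3; 4, 5), and .13, .03 and .45 for Cov (5; 6, 6) − Cov (4; 6, 6)). All function names are hypothetical, and k is assumed to be at least 2.

```python
from fractions import Fraction


def regressors(k, n, n_prime):
    """Return (X1, X2, X3) for Cov(k; n, n') under the assumed coefficient forms."""
    base = Fraction(2 ** (k - 1) - 1)
    x1 = base / 2 ** (k - 2)
    x2 = base / 2 ** (n + n_prime - 4)
    return x1, x2, x1 ** 2


def within_regressors(k, n):
    """Regressors for a variance component Cov(k; n, n) - Cov(k - 1; n, n)."""
    upper, lower = regressors(k, n, n), regressors(k - 1, n, n)
    return tuple(hi - lo for hi, lo in zip(upper, lower))


def fit_through_origin(xs, ys):
    """Least-squares slope for the one-parameter model Y = beta1 * X1 + error."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


print([float(v) for v in regressors(3, 4, 5)])      # [1.5, 0.09375, 2.25]
print([float(v) for v in within_regressors(5, 6)])  # [0.125, 0.03125, 0.453125]
```

For the completely additive model only X₁ enters, so the estimate reduces to the through-origin slope computed by `fit_through_origin`; the two fuller models need an ordinary multiple regression without intercept on X₁, X₂ and X₃.
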
TABLE 3
OBSERVED AND ESTIMATED VALUES OF COVARIANCES
Covariance   X₁   X₂   X₃   Y (sample covariance)   Ŷ₁   Ŷ₂   Ŷ₃ (estimated values)
Cov (2; 2, 3) 1.00 BOO) el OOM Ora LOPS Sa G6 seen io 
Cov (2; 2, 4) 1.00 PA EAD fell 10.9 iL OS UR 
Cov (2; 2, 5) 1.00 cae OOR a5 LORO Set Ope elit 
Cov (2; 2, 6) 1200) 206.74 00" 6.0) dO. 080 10 Oe ei 
Cov (2; 2, 7) 1.00 103) eal 0055 /.6 10.9 10.9 1-0 
Cov (2; 3, 4) 1.00 sliey 1 AOU 85 10897 SROs tek 
Cov (@; 3, 5) 1.00 .06 1.00 14.2 10.9 1029 iO) 
Cov (2; 3, 6) 1.00 POS mete COREL ai O89 cat OPO Sate) 
Cov (2; 3, 7) 1.00 S02. L000 13:8 10.9 10.9 TO 
Cov (3; 4, 5) 1.50 2008 282080 L0.6 16647 6545-9 1625 
Cov (8; 4, 6) 1.50 U5. PO sae LOA SiGe. 16.4 
Cov (8; 4, 7) 1.50 (O2me2e2oue Lone 16:34 el653 16.4 
Cov (4; 5, 6) 1.75 S05 mtonUG meloaO 1 enk 19.1 19.1 
Cov (4; 5, 7) 7s UB) AiGy BA 19 Os 0 eel OL 
Cov (5; 6, 7) 1.88 UR. SSP EOE} 20.5 20.4 19.3 
Cov (2; 4, 5) 1.00 LOS mee OUR LIers LORS EST OFS mei) 
Cov (2; 4, 6) 1.00 02> 1.00.5 927 1079 9 1089" Sa1120 
Cov (2; 4, 7) 1.00 [OTS Ue OO te led. LO SO nO 9 aoe ia) 
Cov (8; 5, 6) 1.50 SO2Me2 Zonet. 0. 1 G24 el Geom Oa: 
Cov (8; 5, 7) 1550 = Ol eZ oe sae LGs4 5 1623 BeelGee: 
Cov (4; 6, 7) aera RULE aU Om GRo Hal wikeyeta) © akeye a 
Cov (2; 3, 3) 100-2423. “1.00% 19.8 1920.9. 91159 bento 
Cov (2; 4, 4) 1.00 .06 1.00 10.9 10:97 10297 20050 
Cov (3; 4, 4) —Cov (2; 4, 4) HA Solis al eta, Atala 5.5 5.6 5.5 
Cov (3; 5, 5) 150. .05. 2.25 -20.0> | 16i4y - 16.348 16 ek 
Cov (4; 5, 5) —Cov (8; 5, 5) EO gt: OG Rear S lems 24 228 2270 
Cov (4; 6, 6) Ai Boies rOOm alice lt LOS ee OF OMe Oot 
Cov (5; 6, 6) —Cov (4; 6, 6) 13 .03 .45 3,0 1.4 1.4 1.4 
Cov (5; 7, 7) Weeey Seep pals 2075) e204 S204 
Cov (6; 7, 7) —Cov (5; 6, 6) LOGi5 e020 ee 24 et. ate eye aif 
r²   .9623   .9625   .9627
Ŷ₁ = 10.92X₁   Additive model
Ŷ₂ = 10.85X₁ + 1.43X₂   Additive model with dominance
Ŷ₃ = 11.06X₁ + 1.07X₂ − .10X₃   Additive model with dominance and additive × additive interaction
The three Ŷ columns shown in Table 3 represent the population
covariances as estimated from linear regression, associated with a 
completely additive model, an additive model with dominance, and an 
additive model with dominance and additive X additive interaction. 

A detailed interpretation of this set of data will not be given here. 
It might be noted, however, that the completely additive model fits 
this set of data as well as the two more complex models, explaining 
about 96% of the variation in the dependent variable. 

Confidence limits for the Cov (k; n, n’) and prediction limits for the 
sample covariances in the case of the completely additive model are 
shown in Table 4. In interpreting the data one must bear in mind 


TABLE 4
CONFIDENCE AND PREDICTION LIMITS FOR THE COMPLETELY ADDITIVE MODEL


Coefficient     Confidence limits for       Prediction limits for
of σ²_A,        population covariances      sample covariances
X₁              l₁         l₂               l₁         l₂

.0625           .63        .73              −4.9       6.3
.125            1.26       1.47             −4.3       7.0
.25             2.53       2.94             −2.9       8.4
.5              5.05       5.87             −.2        11.1
1.0             10.1       11.7             5.2        16.6
1.5             15.2       17.6             10.6       22.1
1.75            17.7       20.5             13.3       24.9
1.875           18.9       22.0             14.6       26.3

that certain of the estimates of the variance components may be biased
upward because genotype × year interaction has been ignored. The
estimates of the 21 sample covariances and the 9 sample variance
components are not completely independent. Linkage may be an
important factor in the early generations and natural selection pro-
gressively more important in the later generations. The F₂ data were
individual plant values, while the data of other generations were based
on plot values.


SUMMARY 


This paper is concerned with the derivation of the expected values of 
sample covariances and variance components in terms of genotypic 
variances and covariances for populations produced from crossing two 
homozygous lines and subsequent self-fertilization. Such genotypic 
variances and covariances have been defined by authors including 
Fisher, Mather, Horner and Kempthorne. Use is made of random
variables such as $\delta_i^t$, where $\delta_i^t = 1$ if the genotype of the plant labeled
i in an experiment corresponds to the tth genotype in a list of possible
genotypes and zero otherwise. Application was made to maturity
data of a soybean experiment. In this experiment 21 covariances and 
9 variance components, unconfounded with environmental effects, 
were estimated. A completely additive model fitted this set of data as 
well as did two more complex models, explaining 96% of the variation 
among the sample covariances and variance components. 
INTRODUCTION


The principles upon which the biometrical analysis of continuous 
variation can be based, have been set up by Fisher et al. [1932] and 
Mather [1949a,b]. The method Mather gives allows the estimation of 
the additive (D), dominance (H) and environmental (E₁ and E₂)
components of variation, and also the detection of linkage. Interac- 
tions, although not explicitly allowed for, can be detected as additional 
sources of error variation. 

As it has become obvious that interactions are one of the more 
important contributory factors of variation, general approaches allowing 
the estimation of these effects in different generations have been de- 
veloped by Cockerham [1954], Kempthorne [1955] and Hayman and 
Mather [1955]. The approach adopted by the last authors is extended 
in the present paper to deal with families raised after backcrossing an
F₂ family to its F₁ and the two parents, with a view to determining
the value of these generations for separating the effects of linkage and 
interaction. The resulting method has been tested by application to 
an experiment yielding data of this kind. 


EFFECTS OF INTERACTION AND LINKAGE IN
BACKCROSSES OF F₂ TO F₁ AND PARENTS


Contribution to first degree statistics

Using the notation described by Hayman and Mather [1955] for
expressing the effects of interaction between two genes, the mean
expression of families raised by backcrossing F₂ to F₁ and parents
(P₁ and P₂) is as shown in Table 1. The mean of F₂ is shown for com-
parison.

In the presence of linkage the mean values of F₂ and F₂ × F₁ ob-
viously depend on interactions and the genic distribution between the
actual parents, but when p = ½ these two effects disappear.
TABLE 1
EFFECTS OF LINKAGE AND INTERACTION ON THE MEANS


Generation Mean expression, 
Fi 4[ha + hel  Ftavjr + 41007? 
F, X Fi $[ha + he) + Fasir(2 — p) + Sljavr?q 
Fy X Pi Ada + ds + ha + ho] + ¥awi(1 + gr) — Sar(+Jais & Jojo + lav) 
F. X Pe 1[—da = dy + ha + hs] % Ftas)(1 + gr) + 4qr(+dals + Joa  Lja0) 


Where alternative signs appear, the upper refers to association and the lower to 
dispersion. 

p = recombination value, q = (1 − p), r = (1 − 2p).

Notation and symbols used in this and subsequent tables follow Mather [1949a, b]
and Hayman and Mather [1955].


The mean expressions of F₂ × P's are on the other hand dependent
on interactions and the distribution of genes between the actual parents
both in the presence and absence of linkage. The effects of association
and dispersion appear as a change of sign in the interaction and additive
terms.

The expectations for the various scaling tests allowing for the effects
of interactions but not linkage have been given by Hayman and Mather
[1955]. These have been extended in Table 2 to tests involving (i) all
selfed generations up to F₃ and (ii) backcrosses of F₂ to F₁ and parents.
The latter test is capable of detecting pure i-interaction whereas the
former also includes a portion of the l-type interaction. The table also
shows two scaling tests based on second degree statistics (cf. constancy
of W − V in diallel cross analyses (Hayman [1954]; Jinks [1954])).


Contribution of two interacting genes to second degree statistics. 


The contribution of two interacting genes to second degree statistics
derived from backcrosses of F₂ is shown in Table 3. The corresponding
expressions for other generations, considered later in this paper, are
also included. To avoid terms involving products of additive and
dominance effects the variances and covariances from the backcrosses
of F₂ to parents are summed over P₁ and P₂ (cf. treatment of back-
crosses, Mather [1949]), thus leaving three statistics, viz. the variances
within and between families and the covariance with F₂ parents. Three
corresponding statistics are provided by the backcross of F₂ to F₁.

The variances and covariance from F₂ × F₁ are simple in expression
and obviously informative, the total effect of interactions appearing
as separate quadratic items and so being unaffected by sign. Further-
more, $V_{1F_2F_1}$ and $W_{1F_2F_1}$ include nothing but simple additive effects
and i-interactions. They are therefore likely to provide a direct measure
of the fixable variation. A further advantage of these statistics is their
independence of genic distribution between the actual parents.

In the case of summed variances and covariance from the back-
crosses to the parents a portion of the effects of interactions still appears
as separate items, and the covariance is unique in having all the effects
of i- and l-interactions expressed in this way, whereas the total effect
of the j-interactions is confounded with the additive component.

The two variances have corresponding separate interaction expres-
sions but different coefficients. They are, however, in complete agree-
ment with regard to the compound terms where a portion of the j-inter-
actions is confounded with the additive component and a portion of
l-interaction with the dominance component. The statistics obtained
from these backcrosses are of course dependent on the genic distribution
in the parents.


The contribution of two linked genes to second degree statistics. 


The definitions of D and H in respect of two linked genes for different 
generations are given by Mather [1949]. Using his notation the expecta- 
tions of the additive and dominance components in the statistics 
obtained from the present backcrosses of F₂ are given in Table 4.
Compared with linkage terms for D and H in other generations pre- 
sented earlier and also given in the table, these six new statistics require 
only one further definition of D and H for complete specification, viz. 


Vorori 
D = 3[d? + ds + 2d,d,(1 — 2p)(1 — 3p)] 
and 


H = 3[he + hy + 2h ho(1 — 2p)"(1 — p)] 


ESTIMATION OF THE EFFECTS OF INTERACTION AND LINKAGE 


Separation of linkage and interaction effects 


The original analysis into D, H, E₁ and E₂ components of variation
and the subsequent tests for linkage and residual interaction, do not 
include interactions as components, whereas a modified method 
proposed by Hayman and Mather [1955] takes interactions explicitly 
into account. They point out one important distinction between linkage 
and interactions in expression, viz. that in each mating system linkage 
terms change with rank of statistic but not with generation, whereas 
interactions tend rather to cause heterogeneity between generations
within ranks. They show, however, that in those generations where 
d-interactions do not affect the definitions of D and H this type of 
interaction may mimic linkage effects. 

The successful separation of linkage and interaction effects depends 
of course on their independence in expression. Accordingly the problem 
has been reexamined by calculating the correlation between linkage and 
interaction expressions for the statistics under the conditions outlined 
below. The fourteen statistics involved include the six under considera- 
tion, six deriving from selfed and biparental progenies of the third 
generation and the remaining two from F₂ and first backcrosses.

Duplicate and complementary types of interactions are considered 
as defined by the following conditions: 


Duplicate: $d_a = d_b = h_a = h_b = -i_{ab} = -j_{a|b} = -j_{b|a} = -l_{ab}$

Complementary: $d_a = d_b = h_a = h_b = i_{ab} = j_{a|b} = j_{b|a} = l_{ab}$


A further instance is considered where the description of the dupli- 
cate type of interaction is modified to allow for the absence of dominance, 
Hew a—s aa 

In all cases the recombination value p = 7 

The results of these calculations are summarised in Table 5 and 


TABLE 5
CORRELATION BETWEEN LINKAGE AND INTERACTION EXPRESSIONS


                       Type of interaction
Gene distribution      Duplicate    Complementary    "Duplicate", no dominance

Association            0.672        −0.599           0.187
Dispersion             −0.387       0.586            0.044


indicate that under certain circumstances up to approximately 40% 
of variation legitimately attributable to interaction may be taken up 
by allowing for linkage and vice versa. 
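
One way to read Table 5 is through the squared correlations, which give the fraction of variation in one set of expressions that is linearly accounted for by the other. The sketch below is an interpretation offered under that assumption, not a calculation stated in the paper; the dictionary layout and names are hypothetical, and the coefficients are those of Table 5.

```python
# Correlation coefficients from Table 5, keyed by (gene distribution, interaction type).
table5 = {
    ("association", "duplicate"): 0.672,
    ("association", "complementary"): -0.599,
    ("association", "duplicate, no dominance"): 0.187,
    ("dispersion", "duplicate"): -0.387,
    ("dispersion", "complementary"): 0.586,
    ("dispersion", "duplicate, no dominance"): 0.044,
}

# Squared correlation = share of variation common to the linkage and
# interaction expressions under a linear relationship.
shared = {key: r ** 2 for key, r in table5.items()}

for key, value in sorted(shared.items(), key=lambda item: -item[1]):
    print(key, round(value, 3))
```

On this reading the largest shared fractions lie between about one third and one half, which is of the same order as the figure quoted in the text.
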


The estimation of linkage and interaction 


Sets of data appropriate to the statistics under consideration can 
be analysed by the inclusive/exclusive form of analysis outlined by 
Mather [1949] and Mather and Vines [1952]. Thus fitting least squares
estimates of the four components D, H, E₁ and E₂ will yield a
S.S. for deviations depending on the effects of linkage and interaction.
After fitting extra parameters to allow explicitly for linkage, any
significant reduction in this residual S.S. must indicate the presence of
linkage effects, provided there is no correlation between linkage and 
interaction in expression. The S.S. remaining after allowing for linkage 
can be ascribed unambiguously to interaction effects. 

A correlation has however been demonstrated between the effects 
of certain types of interaction and those of linkage. Allowing for linkage 
in the presence of interactions and vice versa may therefore lead to an 
inflation of the primary effect at the expense of the residual variation. 
The solution obviously would be to carry out two analyses, one allowing 
explicitly for linkage and the other allowing explicitly for interactions. 
Then in each case the remainder item could be attributed unambiguously 
to the effect not accommodated. This would leave a balance of varia-
tion (i.e. the difference between the inclusive residual S.S. and the sum
of the two remainder items) which could not be assigned specifically 
to either effect. The complete specification of all digenic interaction 
effects requires however, nine extra parameters whereas linkage effects 
require but three (Tables 4 and 6). Therefore a sequential analysis 
of the kind given in Table 7 would seem easier and has in consequence 
been adopted. A + entry in the upper part of this table indicates a 
significant item as compared with basic error while ++ indicates a 
mean square significantly greater than the residual item. The lower 
part of the table gives the interpretation. 

By allowing first for linkage no difficulty in interpretation arises in 
the cases shown in the first three columns and further analysis would 
not be justified. The fourth case is, however, obscure and in this 
instance the data should be reanalysed allowing explicitly for inter- 
actions in place of linkage. The final results are then interpreted 
according to the lower part of the table. 


ILLUSTRATION BY EXPERIMENTAL DATA 


Structure of the experiment


All the data used in this account are from selfed generations and
backcrosses of F₂ to F₁ and parents, derived from the cross between
varieties 1 (denoted P₁) and 5 (P₂) of Nicotiana rustica (Mather and
Vines [1952]).

In 1953, 50 plants taken at random from the F₂ were each selfed
and, in addition, backcrossed to P₁, P₂ and F₁. This yielded seed for
50 families of the F₃ and of each F₂-backcross generation.

The experiment, which was laid out in 1954, consisted of two blocks,
each made up of the two parents (5 plots each), F₁ (10 plots), F₂ (12
plots), F₃, F₂ × P₁, F₂ × P₂ and F₂ × F₁, all with 50 plots each.
The plots, each containing 5 plants, were randomized separately within
each block.


Means and scaling tests 


The mean heights and flowering times are given in Table 8. The 
standard errors attached to the means are calculated from the variation 
between the replicate plot means. 


TABLE 8
MEAN HEIGHTS AND FLOWERING TIMES


Generation     Height (inches)     Flowering time (days from 1/7)

P₁             41.9 ± 0.93         32.1 ± 0.99
P₂             44.7 ± 1.24         35.5 ± 0.60
M              43.3 ± 0.78         33.8 ± 0.58

F₁             49.8 ± 0.65         31.2 ± 0.84

F₂             46.7 ± 1.12         32.8 ± 1.45
F₃             44.7 ± 0.54         34.4 ± 0.78

F₂ × P₁        45.2 ± 0.34         33.5 ± 0.52
F₂ × P₂        46.6 ± 0.45         34.8 ± 0.48
F₂ × F₁        46.2 ± 0.41         34.8 ± 0.51


M = average of P₁ and P₂.


Since no significant reciprocal differences have been found in previous
data from the same crosses, the results from reciprocal crosses are
pooled in this account.

The results regarding the mean values for the two characters are
in satisfactory agreement with the earlier findings, viz. that F₁ shows
heterosis in height and is almost midway between the parents in flowering
time.
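
The height comparison can be made explicit with the means in Table 8. The sketch below is a simple illustration, not part of the original analysis; the function names are hypothetical and the standard errors are ignored.

```python
def mid_parent(p1, p2):
    """Mid-parent value M, the average of the two parental means."""
    return (p1 + p2) / 2


def heterosis(f1, p1, p2):
    """Return the F1 deviation from the mid-parent and from the taller parent."""
    return f1 - mid_parent(p1, p2), f1 - max(p1, p2)


# Height means (inches) from Table 8.
p1_height, p2_height, f1_height = 41.9, 44.7, 49.8

over_mid, over_best = heterosis(f1_height, p1_height, p2_height)
print(round(mid_parent(p1_height, p2_height), 1))  # 43.3, the M row of Table 8
print(round(over_mid, 1), round(over_best, 1))     # 6.5 5.1
```

The F₁ mean exceeds not only the mid-parent but also the taller parent, which is the sense in which heterosis in height is reported.
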

Departure from simple additivity of the genes concerned may be 
detected by application of appropriate scaling tests as described by 
Mather [1949] and Cavalli [1952]. (See also Table 2). In flowering 
time four out of the seven tests in Table 9 demonstrate interactions, 
but in height there is no suggestion of such effects. 
The analysis of variation in height and flowering time


The method used in the analysis of the variation, as revealed by the
different second degree statistics, is that described by Mather [1949].


Height. The components of variation D, H, E₁ and E₂ are estimated
by least squares from the twelve second degree statistics shown in 
Table 10. The estimated values are given in Table 11 and the result 
of the analysis of variance, based on differences between observed and 
expected values, in Table 12. 

The analysis does not present any difficulty in interpretation. Fit- 
ting the four main components accommodates the major portion of 
variation. The item for residual disturbances which contains other 
sources of variation, i.e. linkage and interactions, is not significantly 
greater than the basic error, indicating that any further analysis is 
unjustified. 

The results will be discussed further below. 


Flowering Time. After fitting the four main components, the S.S. for
deviations is still significantly higher than the basic error. This indi-
cates the presence of linkage or interactions, or of both effects.

Linkage is accommodated by fitting D and H components according
to their linkage terms (Table 4). As $V_{2F_2F_1}$ is unique in respect of
linkage, this statistic is allowed to take its own value in the estimation,
leaving two D-components (D₁ and D₂) and three H-components
(H₁, H₂ and H₃). The result of the analysis is given in Table 13. The
significant reduction in the residual S.S. obtained in this way might be
taken as suggesting the presence of linkage effects. However, since
the significant remainder item indicating interaction effects is of the
same order as the linkage item, linkage cannot be inferred on this
evidence alone (see Table 7).

Under these conditions the next step in the proposed sequential 
analysis is to allow specifically for interaction effects. The composition 
of the statistics available, in terms of additive, dominance, interaction
and environmental effects is shown in Table 6. It will be observed that
the range of statistics does not allow the complete solution with the
nine extra parameters required for the full specification of all digenic
interaction effects. Hence an analysis allowing for interactions in a
simplified form has been undertaken. This simplified form of analysis
is designed to include duplicate and complementary types of interaction 
as defined in the section on separation of linkage and interaction effects. 
Where alternative signs appear according to distribution of genes, 
the association form is used. 

The result of the analysis of variance is recorded in Table 13. As 


TABLE 10
OBSERVED VALUES FOR THE STATISTICS 


Height Flowering time 
Statistic 
Block Block 
a ~ maa II if II 
Vireo 30.23 30.22 40.98 59.59 
Wirors W728. 16.47 PPA TES 23.96 
Virs 29.06 29.73 sy abe eval 67.92 
Viropi + Virope 30.52 34.21 50.13 49.96 
Wirop: + Wirop2 TARS9 9.96 2outo 23 .00 
Virori 17.96 eZ 24.27 27.41 
Wirori 9.71 dee 11.95 14.15 
Vors 22.98 20.70 48.62 49.72 
Vorop1 + Voreps _ 38.50 43.06 eae 74.85 
Vorori 31.88 26.19 55.18 69.14 
E, 11.97 13.49 iff 2B 22.77 
EF, 6.81 Beh 2626 12.03 9.81 
TABLE 11 


ESTIMATES OF COMPONENTS OF VARIATION IN HEIGHT


Do | 6-282 2701 
H     18.24 ± 6.66
E₁    14.06 ± 0.97
E₂    11.43 ± 0.74


TABLE 13
ANALYSIS OF VARIANCE IN FLOWERING TIME
(Exclusive Analysis)

[Table printed sideways in the source; the body, including the columns headed "Allowing for interactions", is not recoverable.]

Bartlett's test shows homogeneity of error terms in both analyses.


430 BIOMETRICS, DECEMBER 1956 


in the previous case, the analysis is based on the deviations of observed 
values from expected for the twelve second degree statistics available, 
the expected values being calculated from the exclusive components 
(Table 14). It can be seen that the disturbances remaining after evalu- 


TABLE 14
ESTIMATES OF EXCLUSIVE COMPONENTS OF VARIATION IN FLOWERING TIME
(EXCLUSIVE ESTIMATION)


Allowing for linkage              Allowing for interactions

D₁     57.68 ± 6.29               D      16.89 ± 13.25
D₂     134.61 ± 30.58             D₁     −22.86 ± 35.15
H₁     4.66 ± 25.67               H      −28.28 ± 49.56
H₂     −36.67 ± 62.16             H₁     125.62 ± 42.73
H₃     −22.18 ± 24.78             I      59.67 ± 22.92
E₁     20.13 ± 3.70               E₁     22.26 ± 3.06
E₂     20.58 ± 2.53               E₂     15.85 ± 2.48


Definitions of D and H components are shown in Tables 4 and 6. 


ating the effects of interaction are still significant. The results are 
discussed in the next section. 


DISCUSSION 


The height data from this experiment show a relatively large and
highly significant additive component, indicating that although the
phenotypic difference between the parents is barely significant, the
genotypes are by no means alike. Furthermore, a rather strong sug-
gestion of heterosis in F₁ combined with the absence of non-additive
gene effects supports the view that genes affecting the character under
consideration are present dispersed among the homozygous parents.
The significant dominance component H and the dominance expression
√(H/D) = 0.74 suggest a relatively high degree of dominance, but
not overdominance.
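
The dominance expression is a simple ratio of the two fitted components. In the sketch below, H = 18.24 is the height estimate read from Table 11, while the D value of 33.3 is a hypothetical stand-in chosen only so that the ratio matches the 0.74 quoted above; it should not be taken as the tabulated estimate. The function name is also hypothetical.

```python
import math


def dominance_ratio(h, d):
    """Average degree of dominance, sqrt(H/D).

    Values below 1 indicate partial dominance, a value near 1 complete
    dominance, and values above 1 overdominance.
    """
    return math.sqrt(h / d)


h_height = 18.24   # dominance component for height, from Table 11
d_height = 33.3    # hypothetical value consistent with the quoted ratio

ratio = dominance_ratio(h_height, d_height)
print(round(ratio, 2))  # 0.74
```

A ratio of this size sits well above zero but below unity, which is the basis for describing the dominance as relatively high without being overdominance.
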


Mather and Vines [1952] in an earlier experiment involving different
generations from the same cross found evidence of non-allelic inter-
actions in the height data, the experimental material being at this time
grown at Merton, South London. Their experiment was continued, 
however, at Birmingham, where no suggestion of genic interactions 
could be detected in the same material, which is in accordance with the 
present results (also obtained at Birmingham). This apparent change 
in the properties of individual genes has been discussed by Breese [1954]. 

The presence of residual disturbances affects in general the absolute 
and relative magnitude of D and H as estimated, and thus also the
degree of dominance. A comparison of dominance relationship can, 
however, be made where the effects of interactions are removed. Diallel 
data which included P, and other varieties provide such an estimate of 
√(H/D) which indicates complete dominance (Jinks [1956]). This is
in satisfactory agreement with the present result. 


Flowering time. The indication by the scaling tests of the presence of 
interactions is confirmed by the analyses of variance where allowance 
is made for linkage and interactions respectively, in the former analysis 
by a significant residual mean square and in the latter by significant 
interaction components. 

Less confidence can, however, be placed on the suggestions of 
linkage effects in these analyses. In the first analysis, the linkage item 
is not significantly greater than the residual mean square and so cannot 
be regarded as anything more than a reflection of interaction effects
(see also Mather and Vines [1952]). In the second analysis, since only 
a simplified form of interaction is considered, the significant remainder 
item still cannot be related to linkage any more than to unaccommo- 
dated interaction effects. Furthermore, allowing for interactions in the 
form specified reduces the residual disturbances item by a greater 
amount than does allowing for linkage. Once again the presence of 
interactions rather than linkage effects is suggested, although the 
latter cannot be entirely ruled out. 

As the presence of significant residual disturbances affects the 
size of the additive and dominance components little weight can be 
placed upon the estimated values in Table 14. The insignificance of
$D$ ($= \sum d^2$) is probably due to large error, whereas the apparent
disagreement between $H_1$ ($= \sum hl$) and $H$ ($= \sum h^2$) emphasises the
inadequacy of the underlying assumptions.

The negative value of $D_1$ ($= \sum dj$), although not significant,
suggests the presence of a duplicate type of interaction. The presence 
of interactions in flowering time in Nicotiana rustica, probably of a
duplicate type, is also found in data from diallel crosses (Jinks, [1954], 
Breese, unpubl.). Mather and Vines [1952] did not find similar effects, 
but their data showed a significant mean square for linkage. Different 
manifestations of genes in changing environments may, however, well 
be the reason why data disagree on this point (Breese, [1954]).


SUMMARY 


Following the methods of Mather [1949a,b] and Hayman and
Mather [1955], formulations including additive, dominance, linkage and
interaction terms have been obtained for first and second degree statistics
deriving from a backcross of an $F_2$ family to its $F_1$ and parents. The
experimental recognition of each effect is discussed and the conditions
necessary for the separation of interaction and linkage considered. A
procedure which enables the discrimination of these latter effects is
outlined.

An experiment on Nicotiana rustica set up to illustrate the methods 
has been analysed for the two characters measured (height and flowering 
time). 

In height there is no suggestion of heritable effects other than addi- 
tivity and dominance. On the other hand, analyses involving both 
first and second degree statistics strongly suggest the presence of non- 
allelic interactions in flowering time. Since the number of available 
statistics is not sufficient for the simultaneous assessment of all digenic 
interactions and linkage effects, complete separation of these effects 
is not obtained and the detection of linkage is subject to uncertainty. 
APPLICATIONS OF THE $k_4$ STATISTIC TO GENETIC
VARIANCE COMPONENT ANALYSES*

D. S. Robson
Cornell University, Ithaca, N. Y., U.S.A.

1. INTRODUCTION AND SUMMARY


The distinguishing property of the $k_4$ statistic is that it estimates the
fourth population cumulant. If $\{Y_i\}$, $i = 1, \cdots, n$, is a sample of $n$
independent observations on the chance variable $Y$ then the statistic

$$k_4(Y_i) = \frac{n}{(n-1)(n-2)(n-3)} \left\{ (n+1) \sum_{i=1}^{n} (Y_i - \bar{y})^4 - \frac{3(n-1)}{n} \left[ \sum_{i=1}^{n} (Y_i - \bar{y})^2 \right]^2 \right\}$$

has the expectation

$$E\,k_4(Y_i) = E(Y - EY)^4 - 3[E(Y - EY)^2]^2 = K_4(y).$$


This paper will present two essentially distinct applications of this
property of the $k_4$ statistic to genetic variance component analysis,
(i) in the unbiased estimation of the variance of estimates of variance 
components and (ii) in the estimation of the number of genes or factors 
controlling the inheritance of a quantitative character under the addi- 
tive model with dominance. 
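
As a concrete illustration of the definition above, the following sketch computes $k_4$ from a list of observations. The function name `k4` and the fair 0/1 check are assumptions introduced here for illustration and do not appear in the paper; the check simply averages the statistic over every possible sample of size four, which should reproduce the fourth cumulant of that variable, $1/16 - 3/16 = -1/8$.

```python
from itertools import product


def k4(ys):
    """Fisher's k4 statistic, an unbiased estimate of the fourth cumulant (n >= 4)."""
    n = len(ys)
    if n < 4:
        raise ValueError("k4 requires at least four observations")
    mean = sum(ys) / n
    s2 = sum((y - mean) ** 2 for y in ys)  # sum of squared deviations
    s4 = sum((y - mean) ** 4 for y in ys)  # sum of fourth-power deviations
    factor = n / ((n - 1) * (n - 2) * (n - 3))
    return factor * ((n + 1) * s4 - 3 * (n - 1) / n * s2 ** 2)


# Exact unbiasedness check on a fair 0/1 variable (hypothetical example):
# all 16 samples of size 4 are equally likely, so a plain average is the expectation.
all_samples = list(product([0, 1], repeat=4))
mean_k4 = sum(k4(s) for s in all_samples) / len(all_samples)
print(mean_k4)
```

Because the enumeration is exhaustive, the printed average is the exact expectation of $k_4$ for this small case rather than a simulation estimate.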


2. UNBIASED ESTIMATION OF THE 
VARIANCE OF ESTIMATES OF VARIANCE COMPONENTS 
The fact that $k_4$ may be used to construct an unbiased estimate of
the variance of a sample variance is well known and follows from the
fact that the statistic

$$s_y^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{y})^2}{n - 1}$$

has variance

$$V(s_y^2) = \frac{2}{n-1}\,[\sigma^2(y)]^2 + \frac{1}{n}\,K_4(y). \tag{1}$$


*Paper No. 332 from the Department of Plant Breeding, Cornell University. 


So the statistic

$$\hat{V}(s_y^2) = \frac{2}{n+1}\,(s_y^2)^2 + \frac{n-1}{n(n+1)}\,k_4(Y_i) \tag{2}$$

has the expectation

$$E\,\hat{V}(s_y^2) = V(s_y^2).$$

This result is exact and holds for any probability distribution, the only
assumption being that $Y_1, Y_2, \cdots, Y_n$ are independent and identically
distributed. We shall now indicate the extension of this result to the
estimation of the variance of variance component estimates for a nested
classification typical of the inbreeding experiment.
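
The unbiasedness claim for the estimate of $V(s_y^2)$ can be checked exactly on a small case. The sketch below assumes the estimate has the form $2(s_y^2)^2/(n+1) + (n-1)k_4/[n(n+1)]$, which is what equation (1) implies when $[\sigma^2(y)]^2$ and $K_4(y)$ are replaced by unbiased estimates. The function names and the fair 0/1 population are hypothetical, and the `k4` helper is repeated only so that the block runs on its own.

```python
from itertools import product


def k4(ys):
    """Fisher's k4 statistic (requires n >= 4)."""
    n = len(ys)
    mean = sum(ys) / n
    s2 = sum((y - mean) ** 2 for y in ys)
    s4 = sum((y - mean) ** 4 for y in ys)
    return n / ((n - 1) * (n - 2) * (n - 3)) * ((n + 1) * s4 - 3 * (n - 1) / n * s2 ** 2)


def sample_variance(ys):
    """Usual unbiased sample variance with divisor n - 1."""
    n = len(ys)
    mean = sum(ys) / n
    return sum((y - mean) ** 2 for y in ys) / (n - 1)


def variance_of_sample_variance(ys):
    """Estimate of V(s^2) from a single i.i.d. sample, in the assumed form of equation (2)."""
    n = len(ys)
    s2 = sample_variance(ys)
    return 2 / (n + 1) * s2 ** 2 + (n - 1) / (n * (n + 1)) * k4(ys)


# Exhaustive enumeration of samples of size 4 from a fair 0/1 variable.
samples = list(product([0, 1], repeat=4))
s2_values = [sample_variance(s) for s in samples]
mean_s2 = sum(s2_values) / len(samples)
true_var = sum((v - mean_s2) ** 2 for v in s2_values) / len(samples)
mean_estimate = sum(variance_of_sample_variance(s) for s in samples) / len(samples)
print(true_var, mean_estimate)
```

The two printed numbers agree, which is the exact-expectation statement of the text for this particular distribution and sample size.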


2.1 The one-way classification in general 


The statistical model upon which our genetic variance component
analysis is based is the following. A collection $(\pi_1, \pi_2, \cdots, \pi_M)$ of
probability distributions $\pi_h(x)$ of the chance variable $X$ are mixed
according to the probability distribution $p = (p_1, p_2, \cdots, p_M)$ to
form the mixed distribution $\pi_p$,

$$\pi_p(x) = \sum_{h=1}^{M} p_h \pi_h(x).$$

A sample of $mn$ observations from this mixed population is taken
according to the following sampling plan: $m$ populations are independently
selected according to the distribution $p$ and from each selected
population $n$ independent observations on $X$ are taken. This model
describes the situation, for example, where $m$ $F_2$ individuals are selected
without regard to phenotype and then selfed to obtain $n$ $F_3$ progeny
from each; the chance variable $X$ in this case represents
phenotype measured in the $F_3$ generation.
where

$$\bar{\sigma}^2 = \sum_{h=1}^{M} p_h \sigma_h^2, \qquad \sigma_\mu^2 = \sum_{h=1}^{M} p_h (\mu_h - \mu)^2.$$

With this notation the variance component analysis based on the $mn$
observations $X_{ij}$, $i = 1, \cdots, m$, $j = 1, \cdots, n$, may be written as in
Table 1.

TABLE 1
VARIANCE COMPONENT ANALYSIS FOR THE ONE-WAY CLASSIFICATION

Source of variation      D.f.        M.s.     Mean square expectation
Between populations      m - 1       $M_b$    $\bar{\sigma}^2 + n\sigma_\mu^2$
Within populations       m(n - 1)    $M_w$    $\bar{\sigma}^2$

Variance component estimates: $\bar{s}^2 = M_w$, $s_\mu^2 = (M_b - M_w)/n$

The variance component estimates listed in Table 1 are obviously
unbiased and their variances are easily computed.
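
A minimal sketch of the Table 1 computations follows. It assumes a balanced layout (the same number $n$ of observations in each of the $m$ selected populations) stored as a list of lists; the function name, the dictionary keys and the small data set are hypothetical and serve only to show the arithmetic.

```python
def one_way_components(groups):
    """Mean squares and variance component estimates for a balanced one-way layout.

    groups: list of m lists, each holding n observations.
    """
    m = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("balanced data (equal group sizes) are assumed")
    means = [sum(g) / n for g in groups]
    grand = sum(means) / m
    # Between-population mean square on m - 1 degrees of freedom.
    m_b = n * sum((xb - grand) ** 2 for xb in means) / (m - 1)
    # Within-population mean square: average of the m sample variances.
    s2 = [sum((x - xb) ** 2 for x in g) / (n - 1) for g, xb in zip(groups, means)]
    m_w = sum(s2) / m
    return {"M_b": m_b, "M_w": m_w, "within": m_w, "between": (m_b - m_w) / n}


# Hypothetical data: three populations with three observations each.
result = one_way_components([[1, 2, 3], [2, 4, 6], [5, 5, 5]])
print(result)
```

The "between" entry can be negative in small samples, since it is a difference of mean squares; the sketch reports it as computed rather than truncating at zero.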
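where

$$s_i^2 = \frac{1}{n-1} \sum_{j=1}^{n} (X_{ij} - \bar{x}_i)^2,$$

$$V(s_i^2) = V(\sigma_i^2) + \frac{2}{n-1}\,E(\sigma_i^4) + \frac{1}{n}\,\bar{K}_4$$

and

$$V(\sigma_i^2) = \sum_{h=1}^{M} p_h (\sigma_h^2 - \bar{\sigma}^2)^2.$$

An obvious unbiased estimate of $V(\bar{s}^2)$ is given by

$$\hat{V}(\bar{s}^2) = \frac{1}{m(m-1)} \sum_{i=1}^{m} (s_i^2 - \bar{s}^2)^2. \tag{3}$$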


The variance of the estimate $s_\mu^2$ is complicated by the fact that, in
general, the mean squares $M_b$ and $M_w$ are not independent. Thus,

$$V(s_\mu^2) = V\!\left(\frac{M_b - M_w}{n}\right) = \frac{1}{n^2}\,\{V(M_b) + V(M_w) - 2\,\mathrm{Cov}(M_b, M_w)\}.$$

The mean square $M_b$ may be written
An unbiased estimate of $V(M_b)$ is given by applying equation (2) to the
sample means $\bar{x}_1, \bar{x}_2, \cdots, \bar{x}_m$; thus,

$$\hat{V}(M_b) = n^2 \left\{ \frac{2}{m+1} \left( \frac{M_b}{n} \right)^2 + \frac{m-1}{m(m+1)}\,k_4(\bar{x}_i) \right\}.$$


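The covariance of the mean squares $M_b$ and $M_w$ can be shown to be

$$\mathrm{Cov}(M_b, M_w) = \frac{n}{m}\,\mathrm{Cov}[(\bar{x}_i - \mu)^2, s_i^2]$$

$$= \frac{n}{m} \left\{ \mathrm{Cov}[(\mu_i - \mu)^2, \sigma_i^2] + \frac{2}{n}\,\mathrm{Cov}(\mu_i, K_{3i}) + \frac{1}{n}\,V(\sigma_i^2) + \frac{1}{n^2}\,\bar{K}_4 \right\}$$

and an unbiased estimate of this covariance is

$$\widehat{\mathrm{Cov}}(M_b, M_w) = \frac{n}{(m-1)(m-2)} \sum_{i=1}^{m} s_i^2 \left[ (\bar{x}_i - \bar{x})^2 - \frac{1}{m} \sum_{k=1}^{m} (\bar{x}_k - \bar{x})^2 \right].$$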


An unbiased estimate of $V(s_\mu^2)$ is therefore

$$\hat{V}(s_\mu^2) = [\hat{V}(M_b) + \hat{V}(M_w) - 2\,\widehat{\mathrm{Cov}}(M_b, M_w)]/n^2. \tag{4}$$
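
The pieces entering equation (4) can be assembled as in the sketch below. It is written under stated assumptions: balanced data with $m \geq 4$ populations, $\hat{V}(M_b)$ obtained by applying the $k_4$ result to the sample means, $\hat{V}(M_w)$ taken as the variance of the mean of the $m$ within-population sample variances, and the covariance estimate taken in the form $\frac{n}{(m-1)(m-2)} \sum_i s_i^2 [(\bar{x}_i - \bar{x})^2 - \frac{1}{m}\sum_k (\bar{x}_k - \bar{x})^2]$. That scaling is the one that makes the term unbiased for independently selected populations and should be checked against the printed formula. All names are hypothetical.

```python
def k4(ys):
    """Fisher's k4 statistic (requires at least four values)."""
    n = len(ys)
    mean = sum(ys) / n
    s2 = sum((y - mean) ** 2 for y in ys)
    s4 = sum((y - mean) ** 4 for y in ys)
    return n / ((n - 1) * (n - 2) * (n - 3)) * ((n + 1) * s4 - 3 * (n - 1) / n * s2 ** 2)


def between_component_with_variance(groups):
    """Estimate of the between-population component and of its sampling variance.

    groups: list of m >= 4 lists, each with the same number n >= 2 of observations.
    """
    m, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / m
    dev2 = [(xb - grand) ** 2 for xb in means]
    s2 = [sum((x - xb) ** 2 for x in g) / (n - 1) for g, xb in zip(groups, means)]
    m_b = n * sum(dev2) / (m - 1)
    m_w = sum(s2) / m
    # Variance of M_b: the k4 result applied to the m sample means.
    v_mb = n ** 2 * (2 / (m + 1) * (m_b / n) ** 2 + (m - 1) / (m * (m + 1)) * k4(means))
    # Variance of M_w: variance of a mean of m independent sample variances.
    v_mw = sum((v - m_w) ** 2 for v in s2) / (m * (m - 1))
    # Covariance of M_b and M_w (assumed scaling, see the lead-in).
    avg_dev2 = sum(dev2) / m
    cov = n / ((m - 1) * (m - 2)) * sum(v * (d - avg_dev2) for v, d in zip(s2, dev2))
    return {
        "s_mu2": (m_b - m_w) / n,
        "var_s_mu2": (v_mb + v_mw - 2 * cov) / n ** 2,
    }


# Hypothetical data: four families of two observations each.
print(between_component_with_variance([[0, 1], [0, 4], [1, 1], [4, 0]]))
```

Like the component estimate itself, the variance estimate is unbiased rather than constrained, so individual samples can return negative values.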


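It should also be noted that an unbiased estimate of the variance among
the "within population" variances, $V(\sigma_i^2)$, which is of interest in some
genetic studies, is available in the form

$$\hat{V}(\sigma_i^2) = m\,\hat{V}(\bar{s}^2) - \frac{1}{m} \sum_{i=1}^{m} \left[ \frac{2}{n+1}\,(s_i^2)^2 + \frac{n-1}{n(n+1)}\,k_4(X_{ij}) \right].$$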


2.2 The One-Way Classification in Genetics 


In the genetic situation where $X$ measures phenotype the variance
of $X$ is sometimes further partitioned into genetic and environmental
components. A third component of variance could be introduced into
the general one-way classification simply by introducing a mixture $q$
of populations of the form $\pi_p$ which, in the genetic model, could be
paralleled by a third sampling stage in which each of the $mn$ $F_3$ organisms
is replicated $r$ times. Such a scheme, however, would require that the
organisms reproduce asexually in order to obtain $r$ replicates of any
given individual. We shall assume, instead, that the scale on which
the phenotype $X$ is measured is so chosen as to make environmental
effects independent of genotype. In this case we may, in addition,
suppose that the $n$ individuals of each family appear together in an
experimental unit or plot, since plot effects, which are environmental,
are then independent of family effects. In an actual experiment this
assumption of independence of environmental and genetic effects would
be tested by including in the experiment two or more replicates, or
plots, of several genetically different non-segregating generations such
as $P_1$, $P_2$, and $F_1$. If $r$ plots of a nonsegregating generation and $m$
plots consisting of $m$ independently selected $F_3$ families appear in an
experiment then the variance component analysis takes the form given
in Table 2.


TABLE 2
VARIANCE COMPONENT ANALYSIS FOR THE GENETIC ONE-WAY CLASSIFICATION

Source of variation                      D.f.        M.s.      Mean square expectation
Between $F_3$ families                   m - 1       $M_b'$    $\bar{\sigma}_e^2 + \bar{\sigma}_g^2 + n(\sigma_e^2 + \sigma_g^2)$
Within $F_3$ families                    m(n - 1)    $M_w'$    $\bar{\sigma}_e^2 + \bar{\sigma}_g^2$
Between reps of nonsegr. generation      r - 1       $M_b$     $\bar{\sigma}_e^2 + n\sigma_e^2$
Within reps of nonsegr. generation       r(n - 1)    $M_w$     $\bar{\sigma}_e^2$

Variance component estimates:

$$\bar{s}_e^2 = M_w, \qquad s_e^2 = \frac{M_b - M_w}{n},$$

$$s_g^2 = \frac{(M_b' - M_w') - (M_b - M_w)}{n}, \qquad \bar{s}_g^2 = M_w' - M_w.$$

The average environmental variance within plots is denoted by $\bar{\sigma}_e^2$ and
the environmental variance among plots (totals of $n$ items) is then
denoted by $n\bar{\sigma}_e^2 + n^2\sigma_e^2$. The component $\bar{\sigma}_g^2$ denotes the average within
family genetic variance and $n\bar{\sigma}_g^2 + n^2\sigma_g^2$ denotes the genetic variance
among family totals based on $n$ independently selected progeny.

Unbiased estimates of the variance of the estimates given in Table 2
may be constructed by essentially the same argument used before,
since the mean squares $(M_b, M_w)$ are independent of $(M_b', M_w')$. For
example,

$$V(\bar{s}_g^2) = V(M_w') + V(M_w)$$

where $V(M_w')$ and $V(M_w)$ may each be estimated by (3), and

$$V(s_g^2) = \frac{1}{n^2}\,[V(M_b' - M_w') + V(M_b - M_w)]$$

where $V(M_b' - M_w')$ and $V(M_b - M_w)$ may each be estimated as in (4).


2.3 Extension to a Nested Classification 


The extension of the general one-way classification variance com-
ponent analysis to a general hierarchal classification may be indicated
by the introduction of a third level to the model underlying Table 1.
As mentioned earlier, this may be accomplished by introducing a mix-
ture $q = (q_1, q_2, \cdots, q_R)$ of populations of the form of $\pi_p$; i.e., of
populations which are themselves constructed by mixing a group of
populations. If the sampling plan for this model consists of (i) the
independent selection of $r$ groups, (ii) the independent selection of $m$
populations from each of the $r$ selected groups, and (iii) $n$ independent
observations on $X$ from each of the $rm$ selected populations, then the
variance component analysis may be written as in Table 3.


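TABLE 3
VARIANCE COMPONENT ANALYSIS FOR THE 3-LEVEL NESTED CLASSIFICATION

Source                                   D.f.         M.s.     Mean square expectation
Between groups of populations            r - 1        $M_a$    $\bar{\sigma}^2 + n\sigma_b^2 + mn\sigma_a^2$
Between populations within groups        r(m - 1)     $M_b$    $\bar{\sigma}^2 + n\sigma_b^2$
Within populations within groups         rm(n - 1)    $M_w$    $\bar{\sigma}^2$

Variance component estimates: $\bar{s}^2 = M_w$, $s_b^2 = (M_b - M_w)/n$, $s_a^2 = (M_a - M_b)/(mn)$

The variance components listed in Table 3 are defined by

$$\bar{\sigma}^2 = \sum_{i=1}^{R} q_i \sum_{j=1}^{M_i} p_{ij} \sigma_{ij}^2$$

$$\sigma_b^2 = \sum_{i=1}^{R} q_i \sum_{j=1}^{M_i} p_{ij} \left( \mu_{ij} - \sum_{j=1}^{M_i} p_{ij} \mu_{ij} \right)^2$$

$$\sigma_a^2 = \sum_{i=1}^{R} q_i \left( \sum_{j=1}^{M_i} p_{ij} \mu_{ij} - \sum_{i=1}^{R} q_i \sum_{j=1}^{M_i} p_{ij} \mu_{ij} \right)^2$$

where $\mu_{ij}$ is the mean of population $\pi_{ij}$, $\sigma_{ij}^2$ is the variance in population
$\pi_{ij}$, $p_i = (p_{i1}, \cdots, p_{iM_i})$ is the mixing distribution defined on the set
$(\pi_{i1}, \cdots, \pi_{iM_i})$ to form the mixed population $\sum_j p_{ij} \pi_{ij}$, and
$q = (q_1, \cdots, q_R)$ is the mixing distribution defined on the set
$(\sum_j p_{1j} \pi_{1j}, \cdots, \sum_j p_{Rj} \pi_{Rj})$.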

Unbiased estimates of the variance of the estimates given in Table 3
may be constructed by extending the arguments which led to the
unbiased estimates (3) and (4). An extension of the argument leading
to (3) is obtained by noting that the statistics

$$s_{ij}^2 = \frac{\sum_{k=1}^{n} (X_{ijk} - \bar{x}_{ij})^2}{n - 1}, \qquad i = 1, \cdots, r, \quad j = 1, \cdots, m,$$

satisfy a one-way classification model entirely similar to that of Table 1,
so that an analysis of variance of the $s_{ij}^2$ (Table 4) yields an unbiased
estimate of the variance of $\bar{s}^2$.


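TABLE 4
VARIANCE COMPONENT ANALYSIS OF THE "WITHIN POPULATION"
SAMPLE VARIANCES

Source            D.f.        M.s.     Mean square expectation
Among groups      r - 1       $S_a$    $V(s_{ij}^2) + mV(\bar{\sigma}_i^2)$
Within groups     r(m - 1)    $S_b$    $V(s_{ij}^2)$

Variance component estimates: $\hat{V}(s_{ij}^2) = S_b$, $\hat{V}(\bar{\sigma}_i^2) = (S_a - S_b)/m$

The interpretation of the variance components listed in Table 4 is as
follows.

$$V(s_{ij}^2) = \sum_{i=1}^{R} q_i \sum_{j=1}^{M_i} p_{ij} (\sigma_{ij}^2 - \bar{\sigma}_i^2)^2 + \sum_{i=1}^{R} q_i \sum_{j=1}^{M_i} p_{ij} \left( \frac{2}{n-1}\,\sigma_{ij}^4 + \frac{1}{n}\,K_{4ij} \right)$$

$$V(\bar{\sigma}_i^2) = \sum_{i=1}^{R} q_i (\bar{\sigma}_i^2 - \bar{\sigma}^2)^2$$

According to this notation, then, $V(M_w) = \frac{1}{rm} V(s_{ij}^2) + \frac{1}{r} V(\bar{\sigma}_i^2)$, so that an
unbiased estimate of $V(\bar{s}^2)$ may be computed as

$$\hat{V}(\bar{s}^2) = \frac{1}{rm}\,S_b + \frac{1}{r} \cdot \frac{S_a - S_b}{m} = \frac{S_a}{rm}.$$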


An unbiased estimate of $V(M_b)$ may be obtained directly from (3) as

$$\hat{V}(M_b) = \frac{1}{r(r-1)} \sum_{i=1}^{r} \left[ \frac{n \sum_{j=1}^{m} (\bar{x}_{ij} - \bar{x}_i)^2}{m - 1} - \frac{n \sum_{i=1}^{r} \sum_{j=1}^{m} (\bar{x}_{ij} - \bar{x}_i)^2}{r(m - 1)} \right]^2.$$

Likewise, an unbiased estimate of $V(M_a)$ may be obtained directly
from (2),

$$\hat{V}(M_a) = (mn)^2 \left\{ \frac{2}{r+1} \left( \frac{M_a}{mn} \right)^2 + \frac{r-1}{r(r+1)}\,k_4(\bar{x}_i) \right\}.$$


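If $M_{bi}$, $M_{wi}$ denote the between and within population mean squares
respectively for group $i$ then $M_b = (1/r) \sum_i M_{bi}$ and $M_w = (1/r) \sum_i M_{wi}$,
so that the covariance of $M_b$ and $M_w$ is simply $(1/r)\,\mathrm{Cov}(M_{bi}, M_{wi})$
and is estimated without bias by

$$\widehat{\mathrm{Cov}}(M_b, M_w) = \frac{1}{r(r-1)} \left( \sum_{i=1}^{r} M_{bi} M_{wi} - r M_b M_w \right).$$

The covariance of $M_a$ and $M_b$ may be estimated without bias by

$$\widehat{\mathrm{Cov}}(M_a, M_b) = \frac{mn}{(r-1)(r-2)} \sum_{i=1}^{r} M_{bi} \left[ (\bar{x}_i - \bar{x})^2 - \frac{1}{r} \sum_{k=1}^{r} (\bar{x}_k - \bar{x})^2 \right]$$

since the $\bar{x}_{ij}$ of the present model satisfy the independence requirements
of the $X_{ij}$ of the earlier one-way classification.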


3. ESTIMATION OF GENE NUMBER 


An aid to the interpretation of genetic variance components is
provided by the various genetic models that have been proposed for
describing quantitative inheritance. We shall consider the additive
model with dominance as described by Fisher et al. [1932] and later
expanded upon by Mather [1949]. According to this model there is
independence of gene action between loci but not between the two
alleles at the same locus. The difference in effect between the two
homozygous genotypes at the $i$th locus is denoted by $2d_i$, and the
deviation of the effect of the heterozygous genotype from the mean
effect of the two homozygotes is denoted by $h_i$. If $P_1$ and $P_2$ are two
pure (homozygous) lines whose genotypes with respect to the given
quantitative character differ at $N$ loci then if $\bar{P}_1$, $\bar{P}_2$ denote the pheno-
typic means of $P_1$ and $P_2$, respectively, we may write

$$\tfrac{1}{2}(\bar{P}_1 - \bar{P}_2) = \sum_{i=1}^{N} \delta_i d_i$$

where $\delta_i$ is $\pm 1$ depending upon which of the two homozygous allelic
pairs appears at the $i$th locus in the pure line $P_1$. Similarly, if $\bar{F}_1$
denotes the mean phenotype in the $F_1$ population obtained from the
cross $P_1 \times P_2$ then

$$\bar{F}_1 - \tfrac{1}{2}(\bar{P}_1 + \bar{P}_2) = \sum_{i=1}^{N} h_i,$$
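and unless further complications arise

$$\bar{F}_2 = \tfrac{1}{2}(\bar{P}_1 + \bar{P}_2) + \tfrac{1}{2} \sum h$$

$$\bar{B}_1 = \tfrac{1}{2}(\bar{P}_1 + \bar{P}_2) - \tfrac{1}{2} \sum \delta d + \tfrac{1}{2} \sum h$$

$$\bar{B}_2 = \tfrac{1}{2}(\bar{P}_1 + \bar{P}_2) + \tfrac{1}{2} \sum \delta d + \tfrac{1}{2} \sum h$$

and so on for the various derived generations. Furthermore, in the
absence of linkage the various genetic moments of interest may be
expressed as linear functions of the parameters $\sum d_i^2$, $\sum h_i^2$, $\sum \delta_i d_i h_i$;
$\sum d_i^4$, $\sum h_i^4$, $\sum d_i^2 h_i^2$. For example, in the $F_2$ generation

$$\sigma^2 = \tfrac{1}{2} \sum d^2 + \tfrac{1}{4} \sum h^2$$

$$K_4 = -\tfrac{1}{4} \sum d^4 - \tfrac{1}{8} \sum h^4,$$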
in the first backcross generation B, = F, X P, 
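$$\sigma^2 = \tfrac{1}{4} \sum d^2 + \tfrac{1}{4} \sum h^2 + \tfrac{1}{2} \sum \delta d h$$

$$K_4 = -\tfrac{1}{8} \sum (h + \delta d)^4,$$

while in the $B_2$ generation

$$\sigma^2 = \tfrac{1}{4} \sum d^2 + \tfrac{1}{4} \sum h^2 - \tfrac{1}{2} \sum \delta d h$$

$$K_4 = -\tfrac{1}{8} \sum (h - \delta d)^4.$$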
In the F, generation we would have 
VP +e DM 
V+ UW 
ey oh Le at, Ot 
Vici ted bake, ht tes DL EW. 

If the above interpretation is given to the various genetic moments 
then it is clear that a genetic experiment may be designed which, when 
analyzed by the methods of Section 2, will yield estimates of all the 
parameters described above for the additive model with dominance. 
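
The genetic moments referred to above can be generated mechanically from the genotype frequencies at a single locus, since under the additive model with dominance the loci act independently and cumulants therefore add over loci. The sketch below is an illustration under that assumption. The genotype frequencies used are the standard Mendelian ones (1/4, 1/2, 1/4 in an $F_2$; 1/2, 1/2 in a backcross to the homozygote whose effect is $+d$), and the function names and the two hypothetical loci are not from the paper.

```python
def cumulants(dist):
    """Mean, variance and fourth cumulant of a discrete distribution.

    dist: list of (value, probability) pairs whose probabilities sum to one.
    """
    mean = sum(p * v for v, p in dist)
    mu2 = sum(p * (v - mean) ** 2 for v, p in dist)
    mu4 = sum(p * (v - mean) ** 4 for v, p in dist)
    return mean, mu2, mu4 - 3 * mu2 ** 2


def f2_locus(d, h):
    """Genotypic values and frequencies at one locus in an F2."""
    return [(d, 0.25), (h, 0.5), (-d, 0.25)]


def backcross_locus(d, h):
    """One locus in a backcross of the F1 to the homozygote with effect +d."""
    return [(d, 0.5), (h, 0.5)]


# Hypothetical effects (d_i, h_i) at two independent loci.
loci = [(2.0, 1.0), (1.0, 0.5)]

# Cumulants of a sum of independent contributions are sums of cumulants.
f2_var = sum(cumulants(f2_locus(d, h))[1] for d, h in loci)
f2_k4 = sum(cumulants(f2_locus(d, h))[2] for d, h in loci)
bc_var = sum(cumulants(backcross_locus(d, h))[1] for d, h in loci)
bc_k4 = sum(cumulants(backcross_locus(d, h))[2] for d, h in loci)
print(f2_var, f2_k4, bc_var, bc_k4)
```

For the $F_2$ the per-locus variance and fourth cumulant reduce to $\tfrac{1}{2}d^2 + \tfrac{1}{4}h^2$ and $-\tfrac{1}{4}d^4 - \tfrac{1}{8}h^4$, and for this backcross to $\tfrac{1}{4}(h - d)^2$ and $-\tfrac{1}{8}(h - d)^4$, which is what the sums above evaluate.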


It is then possible to estimate a lower bound on the number 2N of 
segregating alleles. 


Mather [1949] discusses several methods of estimating $N$ which
are based on the general inequality $\sum_1^N y_i^2 - (1/N)(\sum_1^N y_i)^2 \geq 0$, or
$N \geq (\sum_1^N y_i)^2 / \sum_1^N y_i^2$. For example, he suggests the estimates


ae oe 
Nn 2203 bd)” Orn: 
i eae 


| 


>> ddh 


il 


qQ 
L3 
Il 
Nie BIR 


or 


a 
Dd: has 




or, as a second alternative, 


a 
where $\tfrac{1}{4} \sum d^2 + \tfrac{1}{8} \sum h^2$ is the average genetic variance $\bar{\sigma}_g^2$ within
$F_3$ families and $\tfrac{1}{16} \sum d^4 + \tfrac{1}{64} \sum h^4 + \tfrac{1}{16} \sum d^2 h^2$ estimates the variance
$V(\sigma_i^2)$ among the within $F_3$ family variances. Estimates similar to
$N_1$ and $N_2$ could be computed from other generations as well; in each
case the quantity being directly estimated has the general form


By (a6;d; + ony | 


aS 
2D (aid, + bh;)’ 


N 

2 ee 
N . y 
ey (ad; + bhi)’ 


where $a$ and $b$ are given constants, and the lower bound property is
based on the inequality $\sum_1^N y_i^2 - (1/N)(\sum_1^N y_i)^2 \geq 0$.
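
The inequality on which the lower bound property rests can be illustrated numerically. In the sketch below the $y_i$ stand for whatever per-locus quantity a particular estimate uses; the function name and the example values are hypothetical. The ratio $(\sum y_i)^2 / \sum y_i^2$ equals $N$ only when all $y_i$ are equal and falls below $N$ as the effects become unequal, which is why such estimates bound the number of loci from below.

```python
def gene_number_lower_bound(effects):
    """Ratio (sum y)^2 / (sum y^2), which cannot exceed the number of effects."""
    total = sum(effects)
    sum_sq = sum(y * y for y in effects)
    if sum_sq == 0:
        raise ValueError("at least one effect must be non-zero")
    return total ** 2 / sum_sq


# Hypothetical effects at four loci.
equal_effects = gene_number_lower_bound([1.0, 1.0, 1.0, 1.0])
unequal_effects = gene_number_lower_bound([3.0, 1.0, 0.5, 0.5])
print(equal_effects, unequal_effects)
```

With effects of opposite sign the ratio can fall well below the number of loci, so it is a lower bound rather than an estimate of $N$ itself.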

Lower bounds superior to $N_1$ and $N_2$, respectively, may be con-
structed by utilizing the general inequality
which may be shown to reduce to the inequality
(a> @? Doh DS 8d DS dh) 00-8 Do = DD rea) 0: 
Likewise, then, letting x; = di , y; = h; we have 
Fete SAS vr + (DY #) 2 hi—2 Le Rs 
oss id Dah (edb) 
SEN. 
NOTE ON WALD'S METHOD OF FITTING A STRAIGHT LINE
WHEN BOTH VARIABLES ARE SUBJECT TO ERROR


E. S. Keeping
University of Alberta, Edmonton, Canada


1. Introduction 


If $x$ and $y$ are correlated stochastic variables measured on individuals
forming a random sample, the ordinary regression equation $Y = a + bx$
is the best linear predicting equation for $y$, given $x$. The other regression
equation $X = a' + b'y$ is the best linear predicting equation for $x$,
given $y$. If certain $x$ values are selected by the experimenter, the first
equation must be used not only for predicting $y$ given $x$, but also (in
inverted form) for predicting $x$, given $y$. Neither equation, however,
gives as a rule the best structural relationship between $x$ and $y$. A. Wald
[1940] has described one simple and useful method for calculating such 
a relationship and for estimating the error variances of x and y. It 
has been noticed in a text-book exercise (Kenney and Keeping [1954]) 
that these variance estimates may turn out to be negative, and it is of 
some interest to consider why. 


2. The Wald method 
Wald’s assumptions are: 


(i) $x_i = X_i + \epsilon_i$, $y_i = Y_i + \eta_i$, $i = 1, 2, \cdots, N$, where $X_i$, $Y_i$
are the expected values, and $\epsilon_i$, $\eta_i$ the errors, for $x_i$ and $y_i$ respectively;

(ii) the $\epsilon_i$ are uncorrelated and have a finite variance $\sigma_\epsilon^2$; the $\eta_i$ are
uncorrelated and have a finite variance $\sigma_\eta^2$; the $\epsilon_i$ and $\eta_i$ are uncorrelated;

(iii) $Y_i = \alpha + \beta X_i$;

(iv) $N$ is even ($= 2m$), for simplicity;

(v) the $\epsilon_i$ are small enough so that the ordering of the observations
according to increasing $x_i$ and according to increasing $X_i$ are not
substantially different.

If the observations are ordered so that

$$x_1 \leq x_2 \leq \cdots \leq x_N,$$

consistent estimates of $\alpha$ and $\beta$ are given by

$$\hat{\beta} = (\bar{y}_2 - \bar{y}_1)/(\bar{x}_2 - \bar{x}_1), \tag{1}$$

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}, \tag{2}$$
where

$$\bar{x}_1 = \sum_{1}^{m} x_i/m, \qquad \bar{x}_2 = \sum_{m+1}^{N} x_i/m, \qquad \bar{x} = (\bar{x}_1 + \bar{x}_2)/2, \tag{3}$$

and similarly for the $y$'s. The points $(\bar{x}_1, \bar{y}_1)$ and $(\bar{x}_2, \bar{y}_2)$ on a scatter
diagram of observations are the centres of gravity of two groups of
dots, each comprising half the observations. The line of best fit joins
these two points.
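
The construction just described can be sketched in a few lines. The function name `wald_line` is an assumption made here; the data are the twelve pairs of the text-book exercise quoted in Section 3 of this note. The sketch orders the observations by $x$, forms the two group centroids and joins them.

```python
def wald_line(xs, ys):
    """Intercept and slope of the line joining the centroids of the lower and upper halves."""
    if len(xs) != len(ys) or len(xs) % 2 != 0:
        raise ValueError("an even number of paired observations is assumed")
    pairs = sorted(zip(xs, ys))  # order the observations by increasing x
    m = len(pairs) // 2
    lower, upper = pairs[:m], pairs[m:]
    x1 = sum(p[0] for p in lower) / m
    y1 = sum(p[1] for p in lower) / m
    x2 = sum(p[0] for p in upper) / m
    y2 = sum(p[1] for p in upper) / m
    beta = (y2 - y1) / (x2 - x1)
    # The line passes through the overall centroid, which is the midpoint of the two.
    alpha = (y1 + y2) / 2 - beta * (x1 + x2) / 2
    return alpha, beta


X = [40, 41, 43, 45, 47, 50, 68, 77, 80, 90, 100, 100]
Y = [65, 60, 43, 63, 85, 60, 48, 56, 74, 53, 91, 98]
alpha, beta = wald_line(X, Y)
print(alpha, beta)
```

Sorting the pairs breaks ties in $x$ by the value of $y$, which is harmless here because the two tied observations fall in the same half.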

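Consistent estimates of $\sigma_\epsilon^2$ and $\sigma_\eta^2$ are given by

$$\hat{\sigma}_\epsilon^2 = N(s_x^2 - s_{xy}/\hat{\beta})/(N - 1) \tag{4}$$

$$\hat{\sigma}_\eta^2 = N(s_y^2 - \hat{\beta} s_{xy})/(N - 1) \tag{5}$$

where $s_x^2$, $s_y^2$, $s_{xy}$ are respectively the variances of $x$ and $y$ and the
covariance.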

3. Numerical example 


In the text-book exercise referred to, where x and y have the follow- 
ing values: 


x 40 41 43 45 47 50 68 77 80 90 100 100


y 65 60 43 63 85 60 48 56 74 53 91 98 


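we find $\hat{\sigma}_\eta^2 = 264$ but $\hat{\sigma}_\epsilon^2 < 0$.
In the ordinary regression equations the slopes $b$, $b'$ are given by

$$b = s_{xy}/s_x^2, \qquad b' = s_{xy}/s_y^2, \tag{6}$$

so that

$$(N - 1)\hat{\sigma}_\epsilon^2 = N s_x^2 (1 - b/\hat{\beta}),$$

$$(N - 1)\hat{\sigma}_\eta^2 = N s_y^2 (1 - b'\hat{\beta}).$$

The conditions for useful estimates are therefore

$$b < \hat{\beta} < 1/b' \quad \text{or} \quad 1/b' < \hat{\beta} < b, \tag{7}$$

which mean that the Wald line should lie between the two regression
lines. This is not true in the above example.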
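
The figures quoted for the example can be reproduced directly from the twelve pairs of observations. The sketch below assumes, as the factor $N/(N-1)$ in the estimates suggests, that $s_x^2$, $s_y^2$ and $s_{xy}$ are computed with divisor $N$, so that $N s_x^2$ is simply a sum of squares about the mean; the variable names are hypothetical.

```python
X = [40, 41, 43, 45, 47, 50, 68, 77, 80, 90, 100, 100]
Y = [65, 60, 43, 63, 85, 60, 48, 56, 74, 53, 91, 98]


def mean(values):
    return sum(values) / len(values)


pairs = sorted(zip(X, Y))  # order by increasing x
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
N = len(xs)
m = N // 2

# Wald slope: ratio of the differences between the two group centroids.
beta = (mean(ys[m:]) - mean(ys[:m])) / (mean(xs[m:]) - mean(xs[:m]))

# Total sums of squares and products about the overall means (N s^2 in the text).
xbar, ybar = mean(xs), mean(ys)
ss_x = sum((x - xbar) ** 2 for x in xs)
ss_y = sum((y - ybar) ** 2 for y in ys)
sp_xy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

# Error variance estimates with N - 1 degrees of freedom.
sigma_eps2 = (ss_x - sp_xy / beta) / (N - 1)
sigma_eta2 = (ss_y - beta * sp_xy) / (N - 1)

# Ordinary regression slopes and the condition for both estimates to be positive.
b = sp_xy / ss_x
b_prime = sp_xy / ss_y
wald_between = (b < beta < 1 / b_prime) or (1 / b_prime < beta < b)
print(beta, b, 1 / b_prime, sigma_eps2, sigma_eta2, wald_between)
```

Here the Wald slope is smaller than the slope of the regression of $y$ on $x$, so the Wald line lies outside the two regression lines and the estimate of the error variance of $x$ is negative.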
As Wald has pointed out, alternative estimates of $\sigma_\epsilon^2$ and $\sigma_\eta^2$ with
$N - 2$ instead of $N - 1$ degrees of freedom are obtainable from the
variances and covariances calculated from the two groups of observa-
tions taken separately. These estimates have the advantage of being
distributed (when $\epsilon$ and $\eta$ are normal) independently of the estimated
slope. They are given by

$$(N - 2)\hat{\sigma}_\epsilon^2 = N(s_x'^2 - s_{xy}'/\hat{\beta}), \tag{8}$$

$$(N - 2)\hat{\sigma}_\eta^2 = N(s_y'^2 - s_{xy}'\hat{\beta}), \tag{9}$$
where

$$N s_x'^2 = \sum_{1}^{m} (x_i - \bar{x}_1)^2 + \sum_{m+1}^{N} (x_i - \bar{x}_2)^2$$

and similarly for the others. By analysing the variances and covariances
into portions within groups and between groups it is easy to see that
these estimates are just those which we should naturally make, using
mean squares and mean product within groups.

Variance of x        S.S.                                                D.f.     M.s.
Within groups        $N s_x'^2$                                          N - 2      92
Between groups       $(N/4)(\bar{x}_2 - \bar{x}_1)^2$                    1        5167
Total                $N s_x^2$                                           N - 1     553

Variance of y        S.S.                                                D.f.     M.s.
Within groups        $N s_y'^2$                                          N - 2     311
Between groups       $(N/4)(\bar{y}_2 - \bar{y}_1)^2$                    1         161
Total                $N s_y^2$                                           N - 1     298

Covariance           S.P.                                                D.f.     M.p.
Within groups        $N s_{xy}'$                                         N - 2     119
Between groups       $(N/4)(\bar{x}_2 - \bar{x}_1)(\bar{y}_2 - \bar{y}_1)$   1     913
Total                $N s_{xy}$                                          N - 1     191

The numerical values given in the last column are for the example
cited above. They yield $\hat{\sigma}_\eta^2 = 291$, $\hat{\sigma}_\epsilon^2 < 0$.
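
The within-group and between-group partition used in the table can be recomputed from the data. The sketch below is an illustration with hypothetical names: `partition` returns the within-group and between-group sums of products of two variables for the two half-samples, and the same routine gives sums of squares when a variable is paired with itself.

```python
X = [40, 41, 43, 45, 47, 50, 68, 77, 80, 90, 100, 100]
Y = [65, 60, 43, 63, 85, 60, 48, 56, 74, 53, 91, 98]

pairs = sorted(zip(X, Y))  # order by increasing x
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
N = len(xs)
m = N // 2


def mean(values):
    return sum(values) / len(values)


def partition(u, v):
    """Within-group and between-group sums of products for the two half-samples."""
    within = 0.0
    for lo, hi in ((0, m), (m, N)):
        ub, vb = mean(u[lo:hi]), mean(v[lo:hi])
        within += sum((a - ub) * (b - vb) for a, b in zip(u[lo:hi], v[lo:hi]))
    between = (N / 4) * (mean(u[m:]) - mean(u[:m])) * (mean(v[m:]) - mean(v[:m]))
    return within, between


within_x, between_x = partition(xs, xs)
within_y, between_y = partition(ys, ys)
within_xy, between_xy = partition(xs, ys)

# The Wald slope is the between-group mean product over the between-group mean square of x.
beta = between_xy / between_x

# Within-group estimates on N - 2 degrees of freedom.
sigma_eps2_within = (within_x - within_xy / beta) / (N - 2)
sigma_eta2_within = (within_y - beta * within_xy) / (N - 2)
print(within_x / (N - 2), between_x, within_y / (N - 2), between_y)
print(within_xy / (N - 2), between_xy, sigma_eps2_within, sigma_eta2_within)
```

Within and between sums add up to the total sums of squares and products, which gives a quick arithmetic check on any such table.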


4. Discussion 


The quantities $b$, $b'$ and $\hat{\beta}$ are all ratios of a mean product to a mean
square, (total cov.)/(total var. x), (total cov.)/(total var. y) and ("be-
tween" cov.)/("between" var. x) respectively. The failure of the
Wald estimate for $\sigma_\epsilon^2$ is therefore seen to be due to the small covariance
between groups, compared with the corresponding variance for $x$,
which in turn is due to the small "between" variance of $y$ (the covariance
being the geometric mean of the $x$ and $y$ variances). The 90% confidence
limits for $\beta$, calculated by Wald's method, in fact include zero.

An estimate of $\sigma_\epsilon^2$, though probably too large, can be obtained from
the "variance of estimate" about the x on y regression line. This
estimate is $N s_x^2 (1 - r^2)/(N - 2) = 474$. If the distribution of the
true values $X_i$ is known, an estimate can be calculated from the relation

$$s_x^2 = s_X^2 + \sigma_\epsilon^2 (N - 1)/N. \tag{10}$$

Thus if we suppose that the true values $X_i$ are uniformly spaced at an
interval $h$, the variance $s_X^2$ is $h^2(4m^2 - 1)/12$. If in the example above
we take $h = 5$, $X_i$ ranging from 37.5 to 92.5, the estimate obtained is
$\hat{\sigma}_\epsilon^2 = 228$.
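
Both alternative estimates of the error variance of $x$ follow from a few lines of arithmetic on the same data. The sketch below assumes divisor-$N$ variances throughout and takes $r$ to be the ordinary product-moment correlation; the variable names are hypothetical.

```python
X = [40, 41, 43, 45, 47, 50, 68, 77, 80, 90, 100, 100]
Y = [65, 60, 43, 63, 85, 60, 48, 56, 74, 53, 91, 98]
N = len(X)
m = N // 2

xbar = sum(X) / N
ybar = sum(Y) / N
ss_x = sum((x - xbar) ** 2 for x in X)  # N * s_x^2
ss_y = sum((y - ybar) ** 2 for y in Y)
sp_xy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))

# Residual variance about the regression of x on y.
r_squared = sp_xy ** 2 / (ss_x * ss_y)
variance_of_estimate = ss_x * (1 - r_squared) / (N - 2)

# Relation (10) with true values equally spaced at interval h.
h = 5
s_true2 = h ** 2 * (4 * m ** 2 - 1) / 12  # variance of N equally spaced points
sigma_eps2 = (ss_x / N - s_true2) * N / (N - 1)
print(variance_of_estimate, sigma_eps2)
```

The second figure depends entirely on the assumed spacing $h$; a different assumption about the true values would change it.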

It is clear that in this example the fifth of Wald's assumptions is
not fulfilled, the $\epsilon_i$ being much too large. The simplicity of the method
may easily lead to its use in situations where this use is not warranted. 
In such situations, however, it is a waste of time, except as a mere 
exercise, to attempt to fit any line at all, and little confidence could 
be placed in any prediction based on such a line. 
CORRECTION TO “THE USE OF MITSCHERLICH’S REGRES-
SION LAW IN THE ANALYSIS OF EXPERIMENTS WITH 
FERTILIZERS” 

FREDERICO PIMENTEL GOMES 
University of Sao Paulo, Brazil 


Mr. I. R. Nogueira (Escola Superior de Agricultura ‘Luiz de 
Queiroz’, University of S. Paulo) and Mr. V. N. Murty (Central
Tobacco Research Institute, Rajahmundry, India) have independently 
pointed out that the formulae given for F,, on page 508 and for F,, on 
page 509 of this journal, Vol. 9 (1953) should have specified 


Fy, = (2 — 4r + 8r)(1 — n)* 
F,, = Q@)( — 1)°(8 + 4r + Gr? + 47° + 3r’) 


The author wishes to note also that the last equation on page 505, 
and the first on page 506, should be respectively 


AY 41 — A,+.cH ,(1/log Ail meee dx 
dd 541 — A;.,wcH ;(1/log Ale Aa dz. 


INTERNATIONAL BIOMETRIC SYMPOSIUM ON “THE ROLE 
OF BIOMETRIC TECHNIQUES IN BIOLOGICAL RESEARCH” 


TECHNICAL PROCEEDINGS* 


Instituto de Educacao Carlos Gomes, Campinas, Brazil, July 4-9, 1955 


RECENT ADVANCES IN BIOMETRY IN JAPAN 


Edited by M. Masuyama
Institute of Physical Therapy and Internal Medicine, Tokyo University 
AND 
M. Hatamura


Chief of Section of Physics & Statistics, National Institute of Agricultural Sciences; 
National Secretary, Japan, Biometric Society 


We summarize here the present status in Japan of recent studies of 
biostatistical techniques, and of their applications. 


BOOKS AND PERIODICALS 


Several books for statisticians, biometricians and various types
of agricultural technicians have been published recently. For example:
i) T. Kitagawa and M. Mitome [1953], Tables for the design of factorial
experiments,** which is the first extensive collection of various designs
thus far published and unpublished; ii) T. Kitagawa and M. Masuyama
[1952], Statistical tables (revised), nearly one quarter of which
were computed by Japanese statisticians; iii) T. Torii, K. Takahashi
and I. Dohi [1954], Statistical methods in biological and medical research;
iv) T. Nakayama [1954], Design of field experiments; v) M. Masuyama
[1952-4], Popular lectures on experimental designs: Part I, Estima-
tion, Part II, Testing hypotheses, Part III, Planning and designs; vi) T.
Yamamoto [1954], Sampling method in fishery catch statistics; vii) K.
Kinashi [1954], Statistical method in timber survey, and viii) K. Matsushita
and C. Hayashi [1955], Sampling method in forestry and its
applications.

Since 1953, the Journal of Biostatistics has been issued quarterly 
by R. Kawakami. Mainly reports on biometrical surveys are contained. 
Also, in autumn 1953, Y. Kondo, M. Hatamura, Y. Tumura et al. 


*Publication financially aided by the International Union of Biological Sciences. 
**Detailed bibliographic citations are listed under “References.”
started publishing a quarterly journal Research Memoirs of Agricultural
Statistical Society whose contents are articles, reports and news on the 
topics of crop and agricultural economic sampling surveys, and the 
design of agricultural experiments. 


GENERAL THEORY 


The optimality of the orthogonal designs from the viewpoint of 
estimating parameters was proved by S. Moriguti [1954b]. In his 
article the necessary and sufficient conditions for Taguchi’s formulae 
were given. The sufficient conditions were first given by G. Taguchi 
in his ‘Notes on experimental designs.’ 

As an interpolation formula for Tang's table of the error of the
second kind in the analysis of variance, S. Miyagi showed empirically
that the point ($\phi$, $\beta$) lies approximately on a line of the normal prob-
ability paper for fixed $f_1$, $f_2$ and $\alpha$. Later M. Masuyama [1951] deduced
his empirical formula, combining Geary's approximation of the distri-
bution of the ratio of two normal variates and a normal approximation
of non-central chi-square distribution. This is the justification of the
use of the square root paper for the non-central F-distribution. The
exact distribution of this Geary's statistic was given by M. Masuyama
[1952]. After Yamauchi's approach S. Ura extended Lehmer's table
of the error of the second kind in the analysis of variance for $\alpha = 0.05$
and $\beta = 0.10$. He used the notation y which was defined by


take get! 
y et: a co) 


where $\phi$ denoted Tang's original notation. This modified notation
was first used by M. Masuyama and it is more convenient than the 
original notation. 

S. Moriguti [1954a] deduced confidence limits of a variance component
by Welch's approach, which give fairly reasonable values even when
$f_1$ is small, provided that $f_2$ is sufficiently larger than $f_1$.

Instead of various tables of random sampling numbers ‘Random
dice’ (a set of 3 icosahedrons) and ‘Randomer’ (a handy device for
producing 3 random digits instantaneously) are fairly widely used.
They are made in the Development Research Division, The Japanese
Standards Association.

K. Kunisawa et al. published a seven figure table of Kolmogorov's
distribution

$$\Phi(x) = \sum_{k=-\infty}^{\infty} (-1)^k e^{-2k^2x^2}$$

for $x = 0.250(0.001)2.299$. G. Taguchi [1952] gave the minimum value
of $h$ of the Polya-Eggenberger distribution

$$F(k; h, d) = \sum_{n=0}^{k} \frac{h(h + d) \cdots (h + (n - 1)d)}{n!}\,(1 + d)^{-h/d - n}$$

for $h/d = 0.5(0.5)15.0, 20.0, 30.0, 60.0, \infty$, $k = 1(1)25$ and
$F(k; h, d) = 0.95, 0.99$.
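
For readers who wish to reproduce entries of a table of Kolmogorov's distribution, the following sketch evaluates the standard series $\Phi(x) = \sum_{k=-\infty}^{\infty} (-1)^k e^{-2k^2x^2}$, rewritten as $1 - 2\sum_{k \geq 1} (-1)^{k-1} e^{-2k^2x^2}$. The function name and the truncation at 100 terms are assumptions of this sketch and have no connection with the method used to compute the published table.

```python
import math


def kolmogorov_cdf(x, terms=100):
    """Kolmogorov's limiting distribution, evaluated by its alternating series."""
    if x <= 0:
        return 0.0
    tail = sum((-1) ** (k - 1) * math.exp(-2 * k * k * x * x) for k in range(1, terms + 1))
    return 1.0 - 2.0 * tail


for x in (0.25, 0.5, 1.0, 1.358, 2.299):
    print(x, kolmogorov_cdf(x))
```

The series converges very quickly for moderate and large $x$; near the lower end of the tabulated range many more terms contribute and the result is the small difference of nearly equal quantities, so double precision limits the number of reliable significant figures there.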


Studies on various uses of the square root paper were published by 
M. Masuyama [1954c] in English in a booklet which contained many 
exercises in biometry and some industrial applications in non-central 
t- and non-central F-distributions. It was compiled for use in the 
Western Pacific Regional Seminar on Vital and Health Statistics, 
Weert). 

It is not new to use punched cards in the analysis of observed data. 
However, their use to find a required design seems to be new. Its 
theory was first given by C. R. Rao in 1951, but his method requires 
the use of minimum functions, which is not elementary and is impractic- 
able if the size of experiment is fairly large. To remove these difficulties 
M. Masuyama [1955e] used a simple method to produce a deck of 
cards which has the same function as Rao’s, except in the case of Youden 
squares. Various decks of mechanical and hand sort cards were pre- 
pared for almost all cases which are needed in laboratory or in factory. 


DESIGN OF AGRICULTURAL EXPERIMENTS 


Randomized blocks and split-plot design are common in various 
kinds of experiments; e.g. variety, seeding and fertilizer tests, or trials 
for the control of noxious insects or plant-diseases, which are performed 
at the agricultural experiment stations in each district or prefecture. 
Simple lattices, triple lattices and cubic lattices are occasionally used in 
variety yield tests of paddy-rice, wheat, soybean and maize. Lately, 
several groups of experiments conducted in cultivator’s fields have 
been summarized statistically, and some variety × place interactions
were evaluated.

In the Bulletin of the National Institute of Agricultural Sciences,
series A, M. Hatamura, T. Okuno and T. Sasaki published a lengthy 
article ‘On the design and analysis of field experiments’, in Part I of 
which the necessary techniques for the statistical analysis of an experi- 
ment are introduced. In Part II, the randomized block design is 
selected, the relation between the random assignment of treatments in 
each block and the assumptions underlying the normal regression theory 
is discussed, and the stochastic models and discussions of B. L. Welch, 
E. J. G. Pitman, A. Wald and O. Kempthorne are generalized to some
extent. In Part III, the most general theory for the incomplete block 
designs is deduced by the use of particular symbols. Analyses of the 
balanced lattice, the simple lattice, the triple lattice, the cubic lattice 
and the rectangular lattice are obtained as special cases. 

In the Research Memoirs of the Agricultural Statistical Society, 
T. Okuno and T. Sasaki reported ‘Studies on the stochastic models in
analysis of covariance’. In this report, it is assumed that the
independent variable x as well as the dependent variable is subject to
treatment and block effects, and that the models for all variates are 
the same as in the above article. Under these assumptions exact tests 
of significance of treatment effects are introduced and discussed in 
detail. 


BIOASSAY 


Modern techniques of bioassay were intensively used by Y. Ito and 
his collaborators. A monograph, ‘Bioassay and stochastics’, was
published in 1953; it was characterized by a balanced introduction to
the biological and mathematical aspects of recent developments in bioassay
techniques. An empirical method of point-wise estimation of synergistic
action was discussed by M. Masuyama [1955b,c] and his method was 
applied to study the joint action of two antibiotics, whose dose-response 
curves could not be represented by the logarithmico-normal distribution 
functions with the same standard deviation. 

On the other hand, in the field of insecticides, various studies were 
carried out by Kono, Kono & Utida, Nagasawa, Ohsawa & Nagasawa, 
Ohsawa and others. 

In 1954, Penicillium islandicum and other toxic fungi were found in 
imported rice, so that a Toxicological Research Group and Sampling 
Research Group were organized. As a member of the latter group, M.
Masuyama [1955] wrote an article in which he suggested the use of
composite samples in two cases: (i) to screen the lots which contain
toxic fungi, assuming the total number of lots is finite and the fraction
of defective lots is fairly small, and (ii) to estimate the percentage of
infested grain by the method of ‘multi-grain culture per one Petri dish’.
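
The saving that composite samples can offer in case (i) may be illustrated with a small calculation. The sketch below assumes a simple two-stage scheme (one culture on a composite of k lots; if it is positive, each lot is cultured separately) and independent lots with defective fraction p. The scheme, the independence assumption, and the function names are illustrative assumptions and are not taken from Masuyama's article, which also treats the number of lots as finite.

```python
def expected_tests_per_lot(p, k):
    """Expected number of cultures per lot when k lots are pooled.

    Assumed scheme (illustrative only): one test on the composite; if it
    is positive, each of the k lots is then tested individually.
    """
    if k == 1:
        return 1.0  # no pooling: every lot is tested exactly once
    prob_composite_clean = (1.0 - p) ** k
    return 1.0 / k + (1.0 - prob_composite_clean)


def best_pool_size(p, max_k=50):
    """Pool size in 1..max_k that minimises the expected tests per lot."""
    return min(range(1, max_k + 1), key=lambda k: expected_tests_per_lot(p, k))


if __name__ == "__main__":
    # Hypothetical defective fractions, chosen only for illustration.
    for p in (0.01, 0.05, 0.20):
        k = best_pool_size(p)
        print(p, k, round(expected_tests_per_lot(p, k), 3))
```

Under these assumptions pooling pays only when the defective fraction is small, which agrees with the condition stated in the text.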


POPULATION GENETICS 


After the Second World War, concepts of population genetics were
introduced. M. Kimura, K. Sakai and E. Nakamori reported theoretical
studies on the segregation of genes and the degree of homozygosis in 
continued self-fertilization. M. Nei reported mathematical studies on 
the hybrid behavior of partially allogamous plants. 

K. Sakai [1951] and K. Sakai et al. [1951] discussed the importance of
the bulk-method in breeding. K. Sakai et al. [1953b] studied the decrease of
hybrid vigor through the successive generations. Further studies on 
change of heritability in autogamous plants by K. Sakai [1954], estima- 
tion of combining ability by N. Nakamura [1953] and heritability in 
eggplant by K. Gotoh [1954] should be mentioned. Heritabilities in 
maize and in sheep were calculated by N. Nakamura et al. [1953] and 
by K. Matsumoto et al. [1954] respectively. Brief reports concerning 
heritability were published in swine by M. Taketomi et al. [1953, 1954], 
in White Leghorn fowl by Y. Onishi [1953], in tobacco by S. Iyama
[1953], and in soybeans by Y. Yoshino et al. [1953]. 

Results of bulk-method breeding were reported in rice by T. Nagamatsu
et al. [1953] and in wheat by S. Sekizuka [1953]. Meanwhile,
T. Hoshino [1953] and K. Sakai et al. [1953a] called attention to the 
interplant competition in rice, as did T. Yamada [1953, 1954] and 
T. Yamada et al. [1953a,b] in the case of barley, soybeans, red clover 
and wheat. In addition, K. Sakai [1953] estimated the magnitude of 
the competitional variance in mixed plant populations of wheat varieties. 
Thus, theoretical studies on bulk-method breeding are going to be 
reexamined on the basis of the interplant competition. 


ECOLOGY 


Research of Population Ecology, vol. 1 (1952) and vol. 2 (1953) 
contained many studies on the population ecology of various insects by 
the group of entomologists in Kyoto University. Many probabilistic 
investigations on inheritance have been performed by Y. Komatu 
and his collaborator since 1951. 

For sampling survey of vegetation, optimal size and number of 
square grids (sampling units) were studied by M. Numata [1949]. 
The number of square grids is closely related to the homogeneity of 
plant distribution. In this article, he defined a coefficient of homogeneity 
(h) under the assumption of normal distribution. 

There are three types of plant distributions; i.e. random, regular and 
aggregate distributions. The homogeneity of plant distribution means 
the non-aggregate distribution. By M. Numata [1950b], it is shown 
that such a homogeneity consists of individual homogeneity (h) and 
a communal one (floristic and vegetational). 

Several topics in the study of vegetation, viz. typological method, 
sampling method, plant distribution, and the size and number of square 
grids etc. were discussed by M. Numata [1950c]. 

M. Numata [1954b] defines the structure of a plant community as 
a wide concept, including quantitative composition and spatial distri- 
bution of species and life forms, dispersion of biological characters 
measured phytosociologically, and their dynamic variation. Among
these, the dispersive structure is the term widely used to denote plant
distribution, and constitutes a large part of the quantitative structure 
treated from the view-point of homograde statistics. 

There are several methods of stratification of vegetation. The 
stratification by dominant species does not, in general, coincide with 
that by the physiognomy (general outward appearance) of vegetation. 
The latter i.e. ‘ecological stratification’ seems to be justified statistically 
(M. Numata and H. Nobuhara [1952]).

Motomura in 1932 introduced ‘the law of geometrical progression 
of the population density’. The frequencies of individuals of each 
species in a sampled area in certain animal populations are approxi- 
mately arranged in a geometrical progression. M. Numata, H. Nobu- 
hara and K. Suzuki [1953] tried to give its theoretical basis and its 
practical interpretation in plant populations. 
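
A law of this kind implies that the logarithm of abundance falls off linearly with the rank of the species, so it can be checked by a straight-line fit. The following is a minimal sketch, assuming the species counts are ranked from most to least abundant and that the series has the form n_r = n_1 c^(r-1); the function name and any data passed to it are hypothetical and are not taken from the papers cited.

```python
import math


def fit_geometric_series(counts):
    """Fit n_r = n_1 * c**(r - 1) to abundances ranked in decreasing order.

    Uses ordinary least squares of log(count) on rank and returns
    (n_1, c). All counts must be positive and at least two are needed.
    """
    ranked = sorted(counts, reverse=True)
    ranks = list(range(1, len(ranked) + 1))
    logs = [math.log(n) for n in ranked]
    mean_r = sum(ranks) / len(ranks)
    mean_l = sum(logs) / len(logs)
    sxx = sum((r - mean_r) ** 2 for r in ranks)
    sxy = sum((r - mean_r) * (l - mean_l) for r, l in zip(ranks, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_r
    ratio = math.exp(slope)              # common ratio c of the series
    first = math.exp(intercept + slope)  # fitted abundance at rank 1
    return first, ratio
```

A common ratio well below 1 with small residuals would correspond to what the text calls the GP type; a poor fit would suggest one of the other types.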

The structure of a plankton population was studied statistically by 
T. Ito and M. Numata [1954], by means of an artificial population which 
was made of coloured sesame in a 5% solution of NaCl. 

Shinozaki and Urata in 1953 showed the close relationship among 
Motomura’s law, Corbet’s law of harmonic series, Fisher’s law of logarith- 
mic series, and Preston’s logarithmic curve. The observations in a 
beech forest at Mt. Daisan, Tottori Prefecture, Japan showed that 
there were three types of populations; i.e. the GP type for the geometrical 
series, non-GP type or S type which was derived from Shinozaki and 
Urata’s homogeneity concept, and intermediate type. The frequency 
of species in that beech forest was represented by the geometrical 
series only when the size of sample was small or the stratification
according to life forms was introduced; otherwise the population was
of the S type (H. Nobuhara and M. Numata [1954]).

The change of distribution function in the population depending 
upon the size of grid was studied by M. Numata [1950d]. The distribu- 
tion function seems to be Poissonian for smaller grids, and binomial for 
larger grids.

Seasonal variation of the type of weed communities in rice-fields, 
farm lands, forest floors, and a wasteland was studied. Vegetation 
types were expressed quantitatively by means of the coefficients of
biological types. The forms of seasonal variations of vegetation types 
represent biologically habitat conditions. These variations were 
analyzed by M. Numata [1953], from the viewpoint of time series. 


FORESTRY 


Plane integral geometry was applied to estimate the basal area and 
the perimeter of timber (and the number of trees in a stand, if necessary) 
by M. Masuyama [1953a,b]. The same method can be applied to estimate
the total area under certain crops. Under certain conditions the
integral geometrical method was proved to be more efficient than the
ordinary areal sampling method. Instead of the fairly expensive
Pendelrelaskop or Spiegelrelaskop, M. Senda and K. Maezawa [1955] 
made a simple instrument which had an angle-gauge with a slit of 
constant width with which the bias due to the slope of a stand was 
automatically removed. 


FISHERIES 


It was after World War II that statistical methods were intro- 
duced into surveys of fisheries, and that a considerable development in 
sampling method appeared. Since 1951, the fishery catch statistics in 
Japan have been prepared on the basis of sampling methods used on a 
nation-wide scale. Details have been described in Sampling method 
in fishery catch statistics [1954] by T. Yamamoto, and a general descrip- 
tion was issued as an Occasional Report by the regional office of Food and 
Agriculture Organization, Bangkok, in 1955. In these surveys, various 
sampling methods have been applied in accordance with actual situa- 
tions in commercial fisheries. Double sampling ratio estimate using 
the number of trips as an auxiliary variable may save labour and expense 
in enumeration.
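
The double sampling ratio estimate mentioned here can be sketched as follows. The number of trips is counted cheaply for a large first-phase sample, the catch is measured only in a smaller subsample, and the catch per trip from the subsample is applied to the first-phase mean number of trips. This is a generic textbook form of the estimator under assumed inputs; the function name, argument layout and sampling units are illustrative and are not taken from the report cited.

```python
def double_sampling_ratio_total(first_phase_trips, subsample, population_size):
    """Two-phase ratio estimate of the total catch.

    first_phase_trips : number of trips for every unit in the large,
                        cheaply enumerated first-phase sample
    subsample         : (trips, catch) pairs for the smaller subsample
                        in which the catch is actually measured
    population_size   : number of units (for example boats) in the population
    """
    mean_trips_first = sum(first_phase_trips) / len(first_phase_trips)
    total_trips_sub = sum(trips for trips, _ in subsample)
    total_catch_sub = sum(catch for _, catch in subsample)
    catch_per_trip = total_catch_sub / total_trips_sub
    return population_size * catch_per_trip * mean_trips_first
```

The labour saved comes from measuring the catch on few units while still using the trip counts of many.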

Meanwhile, biological surveys of commercial landings have been 
made, usually, by multistage systematic sampling (for instance, boat- 
container-individual fish). T. Doi [1948, 1949a,b], T. Doi et al. [1951],
Y. Fukuda [1953], S. Kurita [1948], S. Tanaka [1953a,b,c, 1954a,b] and 
I. Yamanaka [1953, 1954a,b] each on different occasions dealt with 
methods of sampling, estimation and analysis, particularly for estima- 
tion of survival rate from age-composition. A study made by S. Tanaka 
[1955a,b] concerned the method for estimating the total abundance of 
fish eggs spawned. Under the guidance of T. Kitagawa, T. Yokota [1953] 
attempted to count the total number of fish shoals from the images of 
a fish finder. T. Doi [1950a,b, 1955a,b] and T. Kawakami [1952] exam- 
ined methods for prediction of future catch, analysing the auto-correla- 
tion of catch and the structure of the fish population (age-composition, 
mortality rate etc.). T. Yoshihara [1951, 1952] studied the growth 
of carp in a pond, fitting a logistic curve. The distribution of fish on
the tuna long-line and salmon gill-net was discussed by T. Yoshihara
[1954] and H. Maeda [1953] respectively.

We cannot hope that this summary covers all biometrical projects
in Japan during the last 5 years. Unintentional omissions will be
reported at the next opportunity.


CONTROL OF ERRORS IN SURVEYS


MORRIS H. HANSEN AND JOSEPH STEINBERG
U. S. Bureau of the Census, Washington, D. C., U.S.A. 


1. INTRODUCTION 


In any process of data collection there are potential sources of error 
at every stage in the survey procedure. Errors may arise in defining 
the problem, in defining the universe to be studied, and in defining the 
concepts or establishing the measurement procedures (such as the 
question wording). They may originate in the sampling, i.e., in the 
specification of the units to be included and in the coverage of these 
units. In surveys conducted by field interview, they may arise from 
the complicated structure of the interview situation, in some measure 
stemming from the understanding, interest, motivation, knowledge, 
and skill of the interviewer and the respondent. Procedures for handling 
the data, such as coding and editing and tabulation, may also lead to 
errors in surveys. 

It is the purpose of good survey design to control the errors arising 
from these sources to an economic level. This level is reached when an 
increase in expenditures will not produce a worthwhile decrease in the 
risk of making wrong decisions from the survey results. Although we 
have not found a way to determine objectively this optimum level of 
control in connection with general purpose surveys in which a number 
of different statistics are produced and used for many different purposes, 
we do give the problems continuing attention. The problem of control 
is one of particular concern because errors arising from the various 
possible sources, especially those arising in the field collection of the 
data, sometimes are much larger than is commonly recognized. Often 
they can be controlled satisfactorily, if at all, only by explicit steps 
taken for such control. It is easy, on the other hand, after one becomes 
acquainted with the frequency and magnitude of individual errors, to
be unduly pessimistic about the value of census or survey results. With
reasonable procedures for control, and with the tendency for some of the
types of errors to be more or less compensating, the net effects may be 
small enough so that the statistics will serve adequately many different 
purposes. In this setting, it becomes important to establish procedures 
for evaluating and for controlling the errors that may arise from various 
sources. We shall confine our discussions primarily to the errors arising 
in the collection process, such as those involved in attempting to comply
with the sampling specifications and with other procedures in collecting 
the data. 

It should be emphasized at the beginning that it is one thing to 
specify a good survey design, and quite another to set in motion ad- 
ministrative procedures and controls which insure that the operations 
are carried through substantially as specified. It is to this problem 
that we wish to devote particular attention. 


2. SOME GENERAL APPROACHES TO 
CONTROLLING ERRORS IN SURVEYS 


Many types of controls have evolved from experience and are com- 
monly accepted as effective. The methods ordinarily used to control 
errors—careful selection of personnel, thorough training, and review 
and correction of work—are not always used effectively in the field 
collection phase. Often the review operations involve only the ex- 
amination of returned questionnaires for more or less obvious in- 
consistencies or failures to follow instructions, and perhaps a limited 
amount of reinterviewing or observation of interviewers at work. 

When we are confronted with evidence of unsatisfactory survey 
results, we tend to depend more heavily on such activities as more 
intensive training, or more supervision. Such methods may be effective. 
However, they should be evaluated in an effort to determine whether 
they yield the desired results and whether the additional expenditures 
entailed are commensurate with the added accuracy obtained. It is 
important to recognize that some of these methods of control in censuses 
and surveys may be far from sufficient and that evaluation of each 
phase is desirable. It is unfortunate that comparatively few attempts 
have been made to evaluate such methods objectively. While there 
have been some efforts along these lines in the U. S. Bureau of the
Census, the problems are difficult and the results have not been satis- 
factory. We believe that much additional work is needed in the objec- 
tive evaluation of such techniques. 


Controls in the U. S. Current Population Survey. 


We shall consider the problem of control primarily by illustrating
and discussing some of the methods followed as well as some of the 
results of applying these methods to our Current Population Survey. 
(This is the monthly population sample survey taken by the U. S.
Bureau of the Census for estimating labor force characteristics, and, 
from time to time, information on various other topics.) In this survey, 
field interviews are taken from a sample of about 21,000 households 
each month.* The interviewing is done within about a week in each
month, and the results are compiled and published within about two 
weeks after the end of the enumeration period. Approximately 350 
part-time interviewers work under the direction of about 60 full-time 
supervisors and assistant supervisors in 33 field offices.** The sample 
is spread over 230 primary sampling units each consisting of one or two 
(or occasionally more) adjoining counties. One or a few interviewers 
work in each primary sampling unit on a part-time basis. Most of 
them are housewives, providing a labor supply of relatively high educa- 
tional status and ability, interested and effective in field interviewing. 


General procedures for selection, training, and supervision in the Current 
Population Survey. 


Of course, we attempt initially to select enumerators who are 
qualified to do the kind of work that is involved in our surveys. As an 
aid in selection a series of tests have been designed to determine whether 
enumerators can understand the concepts, read maps, and follow the 
ordinary instructions which they are required to observe in conducting 
the surveys. 

After enumerators have been selected they undergo a training period. 
At the present time new interviewers receive three days of instruction 
during their first month of employment. The initial training consists 
of two days in the area in which the interviewer is to work; it covers
general indoctrination and lectures on the two major operations—listing 
and interviewing—and it includes supervised field practice by the 
enumerator. Other matters receiving considerable attention are mock 
interviews, administrative matters, and work assignments. This first 
two-day period is divided into a number of training sessions of varying 
length. A training guide is provided to the supervisor indicating how 
each period is to be utilized. 

After the initial training the interviewer is given written assign- 
ments and review exercises to be completed at home and mailed to the 
supervisor. The third day of training is given on the first day of enumera- 
tion week of the first month in which the interviewer works. On this 
day the supervisor observes actual interviews, checks some of the listing 
work, reinterviews some of the households, and then continues such 
follow-up training as he believes desirable. 


*Starting in May 1956, the publication of results has been based on a larger sample, now com- 
prising about 35,000 interviewed households in 330 primary sampling units. A brief description 
of the expansion appears in The American Statistician, April 1956, pp. 5 and 6. 

**This staff also is responsible for the field collection of information for the monthly retail trade 
survey and also works from time to time on other census surveys. 
In the second month, the interviewer receives some home study
materials and exercises, for completion and return for review and
correction. The supervisor then accompanies the interviewer for about
a day during enumeration week and observes his work, reinterviews 
some of the households, and discusses with the interviewer problems 
that he observes. In the third month home study materials are again 
used, but usually there is no supervisory visit unless the caliber of the 
interviewer’s previous work has indicated a need for special attention. 

Regular training of experienced interviewers follows this pattern: 
About four times a year, group training sessions are held. In these the 
supervisors review the problems arising in the enumeration and discuss 
specific new types of questions which the interviewers are to use for 
supplementary information. They stress such things as approaches in 
interviewing, completeness of coverage, preparation of forms, and other 
pertinent matters. In other months training is by mail and is adjusted 
to cover any supplemental inquiries for that particular month. Home 
study exercises are provided to the interviewer who completes and 
sends them to the supervisor for review. (Interviewers are paid for a 
stated number of hours to be devoted to these home study exercises. 
They also receive pay for attending group training sessions.) In addition 
to this, supervisors schedule about two visits a year to observe the work 
of trained interviewers and to see how they approach their work and 
how they conduct themselves in obtaining the information. The 
types of controls possible through these training and observation 
periods are usually fairly subjective, since they depend so much upon 
the ability of the supervisor to detect inadequacies and to correct them. 
The group training sessions are designed to stimulate interest, to moti- 
vate the interviewer to do a good job, and to show him the proper 
procedure for carrying out his assignments. 


3. QUALITY CONTROL ON FIELD OPERATIONS OF THE 
CURRENT POPULATION SURVEY 


In an attempt to control the quality of the results of field interviewing 
in the Current Population Survey we have instituted a formal quality 
control and quality check procedure. This has two objectives. First, 
it is a device to control the quality of work of individual interviewers; 
and second, it provides a check on the overall quality of coverage as 
well as content of the survey. 

The system of quality control has been designed to identify those 
interviewers whose work is beyond acceptable limits of performance, 
so that they may be singled out for retraining or other administrative 
action. Our basic policy is to concentrate the supervisory work on 
those interviewers whose work is poorest, and to give less attention to
interviewers whose work is satisfactory. With a relatively stable 
interviewer force, this approach would seem to yield maximum results 
for a given amount of supervisory effort and cost. The importance 
attached to the issue of error control can perhaps best be summarized 
by noting that of every $100 spent on enumeration, about $11 is spent 
on checking the quality of the information collected and retraining 
interviewers whose work is unacceptable; an additional $18 is spent on 
regular training and observation. 

Our quality control operation is carried through by supervisory 
personnel and involves redoing the listing and reinterviewing the house- 
holds in a subsample of the areas covered by the survey. There is a 
presumption that the supervisors can list and interview better than the 
interviewers and that their work can serve as a norm against which to 
judge the quality of the work of the individual interviewers. In any 
one month the subsample consists of about one-fourth of the inter- 
viewers, and about one-third of the work of each of these interviewers is 
covered. To provide an objective method of choosing the work to be 
reviewed, a subsample of the 230 areas is selected (taking into account 
the relative workloads); and within this selected subsample of areas a 
subsample of the segments is predesignated for check. In determining 
the segments to be selected for this check, clusters of segments are 
arranged so that any one cluster tends to be essentially the work of a 
single interviewer and is of sufficient size to enable detection of poor 
work.

When the original interviewer transmits his completed schedules 
to the district office, a clerk in the office transcribes the information 
from each schedule to a reconciliation form. These reconciliation forms 
are then placed in an envelope which is put at the disposal of the person 
conducting the check for use only for reconciliation after the reinterview 
has been completed. 

The supervisor recanvasses the segment and checks the living 
quarters he finds within the segment against the lists originally prepared. 
Errors that are found in coverage of households are recorded. Within 
the sample households he is instructed to make a check of coverage 
of persons and to conduct an independent reinterview using a schedule 
identical with the one used originally. Wherever possible, the reinter- 
view is conducted with the person regarding whom the information is 
being obtained; second choice is the original respondent, and next any 
other acceptable respondent. Then the supervisor compares the 
results of the reinterview with the information on the reconciliation form. 
Differences are called to the attention of the respondent and an attempt 
is made to determine the correct answer. If the reinterview answer is
confirmed the supervisor tries to find the reasons for the discrepancy. 
In addition, the supervisor calls all remaining differences to the attention 
of the interviewer. The reconciliation reduces the discrepancies origin- 
ally found in the content reinterview by about 20 percent. The reinter- 
viewing is done, on the average, about a week following the original 
interviewing. 

This program gives the supervisory staff a measure of quality of 
the work of each interviewer. In keeping with the objective of quality 
control (which is to keep the overall rate of error within some pre- 
determined acceptance limits) the differences found in the work of each 
interviewer are summarized and a determination is made of the ade- 
quacy of his work with respect to (1) coverage of units, (2) coverage 
of persons within households and (3) the subject matter content. At 
present the control for each of these is set so that 95 percent of the 
enumerators with gross error rates of 5 percent will be considered 
acceptable. 
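
A control limit of this kind can be read as an acceptance number. For a given number of checked items, it is the largest count of differences that an interviewer with a true gross error rate of 5 percent would still stay within 95 percent of the time. The sketch below assumes that differences occur independently, so that the count is binomial. That model, the function names and the sample sizes used are assumptions for illustration and not the Bureau's actual procedure.

```python
from math import comb


def binomial_cdf(c, n, p):
    """P(X <= c) for X distributed as Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


def acceptance_number(n, error_rate=0.05, accept_prob=0.95):
    """Smallest c such that an interviewer with the given true error rate
    shows at most c differences in n checked items with probability
    at least accept_prob."""
    for c in range(n + 1):
        if binomial_cdf(c, n, error_rate) >= accept_prob:
            return c
    return n


if __name__ == "__main__":
    # Hypothetical numbers of checked items per interviewer.
    for n in (20, 50, 100):
        print(n, acceptance_number(n))
```

An interviewer whose observed count exceeds the acceptance number for his checked workload would then be flagged for retraining.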

Those interviewers whose work is unacceptable on the basis of any 
of these measures of quality are supposed to be retrained for one day 
prior to or on the first day of the next enumeration period. In this 
training session the supervisor discusses with the interviewer the matter 
of his problems and shows how to overcome his difficulties. A supple- 
mentary check is to be made the following month to determine whether 
the interviewer’s work is acceptable. Those interviewers whose work 
is not under control after such additional instruction are to be replaced
or given further training. 

Let us examine some of the results of this check operation to date. 
First let us consider the quality control aspects in terms of the work of 
individual interviewers. Table 1 gives a distribution of monthly gross 
“error” rates of interviewers for the 12-month period May 1954 through
April 1955. As can be seen, a relatively small proportion of the inter- 
viewers made a large proportion of the errors in coverage. On the 
other hand, discrepancies were not so concentrated with respect to 
subject matter, although only a small proportion of the interviewers 
are deemed unacceptable. 

TABLE 1
DISTRIBUTION OF INTERVIEWERS’ GROSS “ERROR” RATES AND OF GROSS
DIFFERENCES REPORTED, CPS MONTHLY RECHECK, MAY 1954 THROUGH
APRIL 1955


                        Frequency in monthly
                        assignments checked       Gross differences reported
Gross error rate      -----------------------    ---------------------------
(percent)                          Cumulative                    Cumulative
                       Actual      percentage      Number        percentage
Coverage:
   0                    558           100             -              -
   0.1- 0.9               1            15             3            100
   1.0- 1.9               7            15             7             99
   2.0- 2.9               7            14             7             97
   3.0- 3.9               6            13             7             95
   4.0- 4.9              11            12            12             94
   5.0- 5.9              12            11            14             90
   6.0- 6.9               5             9            12             87
   7.0- 7.9              11             8            13             83
   8.0- 8.9               8             6            25             80
   9.0- 9.9               4             5             8             73
  10.0-14.9              10             5            25             71
  15.0-24.9               8             3            53             65
  25.0 and over          12             2           189             50

   Total                660             -           375              -


Table 2 shows in summary the results of some efforts to classify
errors in subject matter by cause. These classifications are assigned by
the check interviewer at the time of the reconciliation, and must be
regarded as more or less superficial. They may, nevertheless, provide
some guide to the sources of differences. On the basis of this analysis,
about one-fourth of the differences between the original survey and
the reinterview arose when a different respondent was reinterviewed.
Approximately a third of the differences arose because either the respondent
misinterpreted the question or the interviewer misinterpreted the
answer.

Relatively few differences seem to arise because the questions are 
not asked properly or because the interviewer misrecorded the answer. 
About one-third apparently resulted even though the respondent 
understood the question: in some cases it appeared that he was thinking 
of a different survey week, forgot part of the correct answer, or erred 
in his calculations. Thus, one-half or more of all differences seem to 
arise because the respondent was at fault, and perhaps only about one- 
fourth of the differences can be attributed to errors by the interviewer. 
Consequently, the gross differences do not provide a sensitive indicator 
of the quality of an interviewer’s work, and their use as a measure of 
the performance of interviewers may lead to inefficient use of resources. 
This analysis raises a question whether the quality measure of inter- 
viewer performance should not be based on differences classified as 
interviewer errors in the check interview. Then perhaps the remaining 
differences, when they are large, can be regarded as indicating the 
existence of difficult interview situations, and additional training of 
interviewers can be focused on insuring that the respondent is led to 
give the proper answer or to understand what is wanted from him. 


TABLE 2
DISTRIBUTION OF APPARENT CAUSES OF DIFFERENCE IN CLASSIFICATION
BETWEEN CPS ORIGINAL AND CHECK INTERVIEW, FOR IDENTICAL
INDIVIDUALS, MAY 1954 THROUGH FEBRUARY 1955


Apparent cause of difference                    Percent (of 257)
Different respondent                                  27.3
Due to enumerator                                     22.5
    Questions not asked properly                       3.1
    Enumerator misinterpreted answer                  17.1
    Enumerator misrecorded answer                      2.3
Due to respondent                                     47.5
    Respondent misinterpreted question                12.8
    Respondent understood question
      but reported incorrectly                        34.7
Other                                                  2.7

Total                                                100.0


We have hesitated to take this approach, and have not done so yet,
because of the dubious nature of the classification of the cause of the
differences.

These data are only fragmentary, and as we continue to study 
further results we hope to learn more about the nature of the problem. 
It is interesting to note that in about half the March 1955 cases where 
the interviewers’ work was unacceptable on the basis of the measure 
of discrepancies from the check, the supervisor did not carry through 
the retraining and reported that he did not regard it as desirable or 
necessary. 

The results of the reinterviews are also summarized to provide an 
overall measure of the quality of the interviewing. Table 3 shows the 


TABLE 3 


Montuity AVERAGE Resuuts FoR IDENTICAL PERSONS INCLUDED IN ORIGINAL 
AND REINTERVIEW SAMPLE, May 1954 THrouen Aprint 1955 


Monthly average num-| Average net 
ber of persons reported|difference, (2)—(1)} Percent 
Employment status ———— identically 
In original} In check | Number} Percent} reported 
interview | interview 
(1) (2) (3) (4) (5) 
Labor Force 1,185 20k i 16 1.3 Oe 
Employed 1,123 ie By 14 1.2 98.1 
Agriculture 127 131 4 3.0 94.7 
Nonagriculture 996 1,006 10 LO 98:5 
Full-time (worked 35 
hours or more) 729 726 ==) =— A 97.5 
Part-time (worked 
less than 35 hours) 215 226 11 4.9 87.9 
With a job but not at 
work 52 54 2 3.7 85.6 
Unemployed 62 64 2 3.1 87.8 
Not in Labor Force 959 943 =16 Hiv 98.9 


results of the original interviews and the reinterviews over the 12-month 
period May 1954 through April 1955. It also shows the proportion of 
persons identified in a particular class in the check who are identified 
in that same class in the original interview. This table relates only to 
persons covered by both the original and reinterview and thus excludes 
about 15 percent of the eligible persons (those not covered because of 
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listing errors, non-interviews in either the original survey or the check, 
and persons who moved in the interval between the two interviews). 
Only the difference for persons working part-time and that for aggre- 
gates affected by this class appear to be statistically significant. The 
comparatively small net differences reflect the fact that many of the 
gross differences tend to be compensating. 
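
The net-difference columns of Table 3 can be recomputed from the two count columns. The following is a minimal sketch, assuming that the percentage in column (4) is taken relative to the check-interview count; that is an inference which reproduces the legible entries, not something the text states. Only rows whose counts are legible in both columns are used.

```python
# Monthly average counts from Table 3: (original interview, check interview).
counts = {
    "Agriculture": (127, 131),
    "Nonagriculture": (996, 1006),
    "Part-time": (215, 226),
    "With a job but not at work": (52, 54),
    "Unemployed": (62, 64),
}


def net_difference(original, check):
    """Return (number, percent) for columns (3) and (4) of the table."""
    diff = check - original  # column (3) = (2) - (1)
    # Assumption: the percent is expressed relative to the check count.
    return diff, 100.0 * diff / check


net = {name: net_difference(o, c) for name, (o, c) in counts.items()}
for name, (diff, pct) in net.items():
    print(f"{name:28s} {diff:+4d} {pct:+6.2f}%")
```

A small net difference says nothing about how many individual classifications changed, which is why the gross differences are examined separately.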


[Figure 1: six panels (Employed in agriculture; Working, nonagriculture, full time; Working, nonagriculture, part time; With a job, nonagriculture, not at work; Unemployed; Not in labor force), each plotting the index of shift by month, May 1954 through April 1955.]


FIGURE 1. INDEX OF SHIFT OF LABOR FORCE STATUS ITEMS, CPS RECHECK, 
AFTER RECONCILIATION, MAY 1954 THROUGH APRIL 1955 


Fig. 1 examines the reinterview results in a somewhat different form.
The index of shift is the ratio of the gross differences in a given category 
to the number in the category as determined by the reinterview. Thus 
a high index of shift indicates great instability in the classification of 
individuals into the category. The groups with marginal attachments 
to the labor force such as the part-time employed, those with a job 
but not at work and the unemployed have the greater instability. 
One of the things which we have been speculating about is the degree 
to which the index of shift, especially in the case of unemployment,
may reflect the variability in classification over time caused by the high 
seasonal incidence of persons with relatively small attachment to the 
labor force. 
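
The index of shift defined above can be sketched in a few lines. This is an illustration only: it assumes that the gross difference for a category counts persons placed in the category by one interview but not by the other, in either direction, and the counts in the example are hypothetical, not survey figures.

```python
def index_of_shift(both, original_only, check_only):
    """Index of shift for one category.

    both          -- persons placed in the category by both interviews
    original_only -- in the category in the original interview only
    check_only    -- in the category in the reinterview (check) only
    """
    gross_difference = original_only + check_only  # assumed definition
    check_total = both + check_only                # number per the reinterview
    return gross_difference / check_total


# Hypothetical counts, for illustration only.
example = index_of_shift(both=56, original_only=6, check_only=8)
net_diff = 8 - 6  # the net difference is much smaller than the gross one
print(round(example, 3), net_diff)
```

With these made-up counts the gross difference is 14 while the net difference is only 2, which is the compensating behaviour noted for Table 3.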

We have also examined some of the net differences to determine 
whether on an overall basis the results of the Current Population Survey
can be considered to be under control. Figs. 2a and 2b show net differences
between the original interviews and the reinterviews. (Points
above the horizontal axes denote a net relative understatement in the 
original Current Population Survey enumeration; points below denote 
a net relative over-statement.) For the most part, the points on the 
chart seem to indicate that the various statistics are under control, 
although for the unemployed group (Fig. 2b, right) one may speculate 
whether the results in July and November 1954 may not suggest some 
problems for those months. 

Experience to date is too limited to give any assurance that this 
system of quality control will adequately signal trouble when it arises. 
The reinterview and reconciliation procedures that we have been 
using do not directly enable us to distinguish those errors or differences 
caused by the respondent from those caused by the interviewer. While 
our efforts have been designed to keep our net errors small, the quality 
control features of this program have been aimed toward reducing the 
gross differences. However, it appears necessary for us to direct more 
attention to the kinds of errors contributed by respondents. Also, 
intensive work on the evaluation of coverage has not been sufficient to 
explain consistent differences of about 2 +e-4 percent in the total popu- 
lation estimated from the Current Population Survey and that esti- 
mated on the basis of the last census. We are not yet satisfied that our 
intensive measurement methods are adequate. 


4. FUTURE WORK 


The check operation discussed represents our major effort so far 
to ascertain differences objectively, and to control them by determining 
the sources of contribution to error and by taking steps to deal with the 
problems identified. We have been experimenting with more intensive 
types of reinterview schedules in an effort to determine whether we can 
more clearly ascertain reasons for differences. Fragmentary results 
now suggest that these more intensive interviews do lead to improved 
understanding, although no substantial changes in error or gross 
difference rates have been observed from their use in our limited ex- 
perience. One observation is that the use of a check list which more 
precisely defines the concepts to the respondent may help in carrying 
out both an initial interview and reinterview. We are currently considering
the possibility of using such a check list in the original interviews.

A great deal of research still remains to be done. We are conducting 
additional studies to learn more about sources of error in our surveys 
and about the effectiveness of our supervisory activities. Much still
needs to be learned as to the relative advantages or disadvantages of
the different steps taken, which are based on intuitive grounds and for 
which, to date, little quantitative information is available. For example, 
what should be the methods and the periodicity for group training? 
What should be the periodicity for observations? What other type of 
measures do we need to institute? What incentives can we provide so 
that the persons working at different stages of a survey can be properly 
motivated and so that the types and number of errors can be controlled 
to a desired level? And last but not least, how can we insure that the 
respondent can be “trained” to give the information which we need in
order to provide the data required? 

Survey design has come a long way. Sample design has been studied 
so that control of its contribution to the mean square error of survey 
estimates is well understood. We have yet a lot to learn about control 
of the other contributions to the mean square error. This is a report 
of progress on approaches to such control in one continuing survey, 
and on some of the problems and difficulties still to be resolved. 


THE STUDY OF THE PHYSIOLOGICAL EFFECTS OF HOT 
CLIMATES 


J. O. IRWIN


Statistical Research Unit of the Medical Research Council, London School of Hygiene 
and Tropical Medicine, England. 


1. THE PROBLEM—EFFECTIVE TEMPERATURE 


“One of the physiological requirements for health is the maintenance of a 
practically constant body temperature. In man the average mouth temperature is 
about 98.4° F. and the temperature of the deep tissues is about 99° F. The body 
temperature is controlled by physiological mechanisms which regulate the rate of 
heat loss from the body surface. It is through the operation of these mechanisms 
that the temperature of the body remains nearly constant even though the warmth 
of the environment varies over a considerable range, and over long periods measure- 
ments of heat production and heat loss correspond very closely. But in hot environ- 
ments the body may not be able to get rid of heat as fast as it produces it, then the 
body temperature will rise, and if the exposure is prolonged, serious results may 
ensue.” 

“The human body loses heat by three paths, radiation, convection and evapora- 
tion. The environmental factors which affect the rate of heat loss are the temperature, 
humidity and rate of movement of the air, and the radiation from the surroundings, 
but the rate of loss is largely governed by the physiological mechanisms which serve 
the body as thermostatic controls.” Bedford [1946]. 


Thus in comparing two thermal environments we have to take 
account of temperature, humidity, rate of air movement and radiation. 
The advantage, indeed the practical need, of having a single index of 
thermal environment became apparent a long time ago. This need 
was to a considerable extent met by the index known as “effective 
temperature”. Originally, effective temperature was designed to take 
account of the temperature, humidity and rate of movement of the air 
and was a measure of subjective feelings of comfort. It was based on 
experiments carried out at the Research Laboratory of the United 
States Bureau of Mines, Pittsburgh, by Houghten and Yagloglou [1923]
and their colleagues.


“Trained observers passing between two controlled climate rooms identified
widely varying combinations of dry and wet bulb temperatures which gave rise to
the same subjective sensations of comfort. The conditions in one room were kept
constant and in the other the temperatures were raised or lowered until the temperature
was identified at which the sensations in the two rooms corresponded accurately.
The walls of the climatic rooms were at substantially the same temperature as that
of the air. “Equal comfort lines” for “still air” fitting the corresponding wet and
dry bulb temperatures in the two rooms were superimposed on a standard psychrometric
chart. The effects of higher air velocities were then [illegible] and alignment
charts were constructed for assessing “effective temperature” which related
conditions under examination to the corresponding temperature of still and saturated
air at which the same effect on thermal comfort would be experienced.” (Ellis [1953]).

Effective temperature is therefore the temperature of still air saturated
with water vapour in which a sensation of warmth, equivalent to that 
experienced by those in a particular environment under investigation, is 
on the average reported by the subjects in a long series of tests. Since 
clothing reduces the effect of air movement, a “normal scale” was
constructed for persons wearing ordinary indoor clothes and a “basic
scale” for persons stripped to the waist. The limiting conditions to
which men might be exposed without sustaining ill-effects were said 
to be an effective temperature of 90° F. if they were at rest and 80° F. 
if they were engaged in heavy work. 

The original scale of effective temperature took no account of 
radiation. Bedford pointed out that it can be corrected approximately 
for the effects of this factor if the globe thermometer temperature is 
used instead of the dry bulb temperature in calculating the effective 
temperature. The globe thermometer is an ordinary thermometer
with its bulb at the centre of a hollow 6-in. metal sphere coated with 
matt black paint. If the walls and other surfaces which surround the 
globe are warmer than the air, the temperature recorded by the ther- 
mometer inside the globe is above air temperature. Consequently, the 
readings of the globe thermometer make some allowance for radiation 
from the surroundings. 

The effective temperature scale was devised well before the outbreak 
of the second world-war. Problems of excessively hot environments 
aroused a great deal of attention during that war because these con- 
ditions were experienced in warships serving in tropical waters. 


2. PREDICTED FOUR-HOUR SWEAT RATE 


In 1944 the British Medical Research Council was asked by the 
British Admiralty to investigate the climatic effects of the working 
conditions prevailing between decks in tropical waters. This led to 
the establishment of the Tropical Research Unit at Singapore in Jan- 
uary 1949. 

Meanwhile a team of civilian workers and naval medical officers at 
the National Hospital, London, under the leadership of Dr. McArdle 
observed the physiological effects of heat on men at rest or working at 
fairly high rates stepping on and off 12-inch stools. Observations were
made on young naval ratings after they had been acclimatised “artificially”
to work at high temperatures by daily work under hot conditions
for two or three weeks. These young men were then able to
work satisfactorily at tasks involving energy expenditures similar to 
those of gun crews during a bombardment and whilst wearing anti-flash 
protective clothing, when the wet bulb temperature was as high as
90° F. The findings in general supported Haldane’s [1905] view that
the level of the wet bulb temperature was a useful guide to the thermal 
tolerability of an atmosphere, and showed that the upper permissible 
levels of wet bulb temperature could be raised by 2° to 3° F., if men 
worked stripped to the waist in shorts instead of wearing overalls, or 
alternatively if the air movement was raised from 20 ft/min. (still air) 
to 200 or 300 ft/min. (for men wearing shorts or overalls respectively) 
under hot humid conditions. On the other hand, they found that 
little was to be gained by increasing the air movement above 300 ft/min. 
under the conditions of these experiments. It was accepted that these 
limiting conditions would be lowered if the temperature of the walls or 
surroundings exceeded the air temperature, if the men worked more 
vigorously, if they were already fatigued or untrained, or if the subjects 
were old men or less well acclimatised to work at high temperatures. 

The standard effective temperature charts apply only to men 
engaged in light or sedentary activities, and the workers at the National 
Hospital found that they were inaccurate for predicting the relative 
physiological stress experienced by men working under relatively warm 
conditions. Rises of dry-bulb temperature and falls of wet-bulb tem- 
perature for which the effective temperature was unaltered resulted in 
a decrease of stress. They argued therefore that the influence of the 
dry-bulb temperature was over-emphasized at the expense of the
wet-bulb temperature. There was also insufficient correction for the 
deleterious effects of low air movement (< 100 ft/min.) in hot humid 
conditions, especially when the subjects were clothed and the wet 
bulb temperature was high. Further, in hot dry conditions when the
air movement is increased from 20 to 200 ft/min. the effective temperature
scale indicated progressive improvement with increasing air
movement instead of deterioration.

On account of these defects in effective temperature McArdle and 
his colleagues were led to construct an alternative index of stress based 
on sweating rates. Using the results of nearly 1000 individual experiments
they constructed an empirical nomogram from which the “predicted
4-hour sweating rate” (P4SR) for any set of working conditions
could be ascertained, provided the environmental factors, the metabolic
cost of the work and the clothing worn were known. The nomogram
gives the P4SR for given wet-bulb temperature (°F.), dry-bulb temperature
(°F.), air velocity (ft/min.) and metabolic rate (kcal./m²/hr.).
Predicted 4-hour sweat loss would have been a better term, for it
is the amount of sweat lost in 4 hours which is predicted from the
nomogram.

This was the position when the Tropical Research Unit at Singapore
started work in January 1949. Its main task was to test the conclusions
of the workers at the National Hospital, London, with young men as 
subjects who were naturally acclimatised to living in the Tropics. 
Among the objectives at Singapore were: 


(1) To investigate in the Tropics the ability of men to withstand 
varying combinations of 
(a) differing dry and wet bulb temperatures and air movements 
(and later varying mean radiant temperature—air tempera- 
ture gradients), 
(b) different clothing systems, 
(c) work involving various metabolic costs, 
(d) varying periods and intervals of exposures to these con- 
ditions. 
(2) To estimate the predictive accuracy of the “P4SR” scale.
(3) To assess the value of the Effective Temperature scale for 
grading the severity of thermal conditions in relation to human 
activities in the Tropics. 


Physiologists have attached great importance to obtaining a com- 
prehensive index of stress. Effective temperature, for instance, is 
widely used by ventilation engineers. P4SR is another such index. 

Really, of course, this is a double discriminant function problem. 
On the one hand, we have certain environmental variables: wet bulb 
temperature, dry bulb temperature, air velocity, work rate, type of 
clothing. These might be called the stress variates. On the other, we 
have variables which measure the physiological responses such as 
sweat rate, temperature, pulse rate. These might be called the strain 
variables, to use a distinction with an obvious physical analogy, sug- 
gested by my colleague, Dr. Macpherson. The problem is to find the 
best functions of each set of variables for making a prediction of strain 
from stress. I doubt whether canonical correlation technique could be 
used, but to state the problem in this form makes the aim of the work 
clear. 

One needs to emphasise that in this sense P4SR is an index of stress, 
not of strain. Once the nomogram had been constructed, P4SR was 
effectively a function of the climatic variables and work rate, and its 
object was to predict strain. 


3. AN EXPERIMENT 


I propose to describe the first experiment which was carried out at 
Singapore. The object was to determine the effects on men naturally 
acclimatised to the Tropics of exposure for four hours twice weekly
to varying combinations of air temperature, humidity and air movement.
Though not an ideal experiment, it was, I think, a great deal better than 
anything which had been done before. In particular, it shows how one 
may have to adapt statistical analysis to a situation which is by no 
means envisaged in a text book. 

Table 1 shows the combinations of air velocity, dry bulb temperature 


TABLE 1
ENVIRONMENTAL CONDITIONS OF THE EXPERIMENTS


Dry bulb           Wet bulb temperature (°F.)
temp. (°F.)        80      83      85      88

    90              1       -       1       -
                    2       -       2       -
                    3       3       3       3
                    4       -       4       -

   100              3       3       3       3

   120              1       -       1       -
                    2       -       2       -
                    3       3       3       3
                    4       -       4       -


Numbers 1-4 denote air movements as follows:

1 = 44 ft/min.
2 = 86 ft/min.
3 = 300 ft/min.
4 = 500 ft/min.


and wet bulb temperature actually used. They were designed to cover 
the same range as had been investigated in London. Originally a 
4 × 3 × 2 factorial arrangement had been suggested, but this was
modified for technical reasons. If 83° and 88°F. wet bulb temperatures 
had been tested instead of 80° and 85° at all air velocities the design 
would have been easier to analyse, but it was felt that this advantage 
was outweighed by the possibility of men collapsing while working 
at 88° F. wet bulb temperature, and such collapses, in themselves 
undesirable, would have complicated the statistical treatment of the 
results in another way.
There were 3 teams with 4 subjects in each—young naval ratings 
who volunteered from ships or shore establishments on the Far East
Station. Each team had two 4-hour periods a week in the hot room.
Four work-clothing combinations were tested at each exposure: working 
in shorts; working in overalls; resting in shorts; resting in overalls. 
Work consisted in step-climbing according to a certain routine. These 
four categories have been called “Postures” for convenience. They
may be allocated to 4 subjects in 24 different ways, and one of these 
was assigned randomly to each of the 24 climate combinations, separate 
randomisations being used for each team (Table 2). 
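
The allocation just described can be sketched as follows. This is an illustrative reconstruction, not the procedure actually used at Singapore: the function name `randomise_team`, the posture labels and the seeds are assumptions introduced for the example.

```python
import itertools
import random

POSTURES = ("a", "b", "c", "d")  # the four work-clothing categories


def randomise_team(n_climates=24, seed=None):
    """Assign one of the 24 posture orderings to each climate combination."""
    rng = random.Random(seed)
    orderings = list(itertools.permutations(POSTURES))  # 4! = 24 orderings
    rng.shuffle(orderings)
    # Climate k (0-based) receives orderings[k]; entry j of an ordering is
    # the posture taken by subject j + 1 of the team.
    return dict(zip(range(n_climates), orderings))


# A separate randomisation for each of the 3 teams (seeds are arbitrary).
plans = {team: randomise_team(seed=team) for team in (1, 2, 3)}
print(plans[1][0])
```

Because every one of the 24 orderings is used exactly once per team, each subject takes each posture in 6 of the 24 climates.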


TABLE 2
COMBINATIONS OF CLIMATE AND POSTURE

[Table body not recoverable. Rows: the climate combinations (I, II, III, ...), randomised separately for each team. Columns: persons 1 to 4, randomised. Entries: the posture assigned to each person.]

In this plan all separate climate and posture comparisons were
unconfounded with differences between persons and were therefore 
equivalent to comparisons on the same persons. The error term for 
these was originally intended to be based on such climate-posture 
interactions as were not confounded with personal differences. In 
testing the climate-posture interactions, it was proposed originally to 
ignore differences between persons within teams. It was not realised 
that these interactions were as important as they proved to be. Further, 
by what was subsequently recognised to be an error of judgment, the 
12 subjects were, on the basis of a uniformity trial carried out before the 
main trial started, divided into four grades of sweating with three 
subjects in each and one member of each grade was put in each team. 

Thus, there were two difficulties to be faced in the analysis. The 
arrangement of the climatic variables was not factorial, and it was not 
possible to allow for personal differences by the analysis of variance 
itself. The only way to correct for differences between persons was by 
analysis of covariance on the basis of the uniformity trial. To meet the 
other difficulty, the results of the trial were divided into two distinct 
sections (referred to in this report as Sections A and B) the first con- 
taining all combinations of 90° and 120° F. dry bulb temperature, 
80° and 85° F. wet bulb temperature and the four air velocities, and 
the second containing all combinations of 90, 100, 120° F. dry bulb 
temperature, and 80, 83, 85, 88° F. wet bulb temperature at the third 
air velocity (300 ft/min.). The analysis of covariance was carried out 
separately for the two sections. The combinations of 90° and 120° F. 
dry bulb temperature with 80° and 85° F. wet bulb temperature at an 
air velocity of 300 ft/min. occurred in each. 
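
The division into Sections A and B can be checked by enumerating the climates. The sketch below only restates the levels given in the text; the tuple layout (dry bulb, wet bulb, air velocity) is a choice made for the example.

```python
from itertools import product

# Climate = (dry bulb °F, wet bulb °F, air velocity ft/min).
section_a = set(product((90, 120), (80, 85), (44, 86, 300, 500)))
section_b = set(product((90, 100, 120), (80, 83, 85, 88), (300,)))

all_climates = section_a | section_b  # every climate actually tested
shared = section_a & section_b        # climates analysed in both sections

print(len(section_a), len(section_b), len(all_climates), len(shared))
# prints: 16 12 24 4
```

With four postures per climate this gives 64 climate-posture combinations in Section A and 48 in Section B.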

The statistical analysis was carried out for a number of response 
variables:— total sweat loss, total sweat loss per square metre of body 
surface, evaporative water loss (absolute and per square metre), final 
rectal temperatures, final pulse rates (seated and standing), comfort 
ratings and efficiency ratings. 

Whenever an effect was found to be significant in the analysis of 
covariance, tables of the relevant means and their standard errors were 
compiled with a short paragraph headed “remarks” briefly describing
each effect. In this way the salient features of the results were treated
systematically for the first time, I believe, in this particular field. I
do not propose to trouble you with the details of the analysis of covariance,
but I give examples in Tables 3 and 4 of the manner of presenting
the results. These particular tables refer to “Total Sweat Loss”.

The important conclusions from Tables 3 and 4 are:
(1) The direct posture effects are not completely informative if we do
not consider the interactions with climate as well.


TABLE 3 (SECTION A)
TOTAL SWEAT LOSS: AVERAGE POSTURE AND AVERAGE CLIMATE EFFECTS
(Mean values in grammes. No correction necessary)


(i) Posture Means

(a) Shorts working   2582        (c) Overalls working   2721
(b) Shorts resting   1475        (d) Overalls resting   1409

Shorts - Overalls     -37
Working - Resting    1210
Interaction          -103

Standard error 38 (for each of the above means)


(ii) Climate Means

Air velocity    Wet bulb temperature (°F.)    Mean    Dry bulb temperature (°F.)    Mean
(ft/min.)            80         85                         90        120

    44             1964       2444            2204        1255       3152            2204
                        (480)                                  (1897)
    86             1768       1992            1880        1053       2707            1880
                        (224)                                  (1654)
   300             2045       1918            1982         947       3016            1982
                       (-127)                                  (2069)
   500             2041       2204            2122         805       3440            2122
                       (+163)                                  (2635)
  Mean             1955       2140            2047        1015       3079            2047

Standard error of single means 53

Remarks: The direct posture effects are not completely informative if we do not consider the
interactions with climate as well. The excess sweat rate of working over resting subjects is the only
important significant average effect; it is a little greater in overalls than in shorts.
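
The three posture contrasts in part (i) of the table can be recomputed from the four posture means. The formulas below are inferred from the printed figures, which they reproduce to within rounding; the paper does not state them explicitly.

```python
def posture_contrasts(a, b, c, d):
    """Contrasts among the four posture means.

    a = shorts working,   b = shorts resting,
    c = overalls working, d = overalls resting.
    """
    shorts_minus_overalls = (a + b) / 2 - (c + d) / 2
    working_minus_resting = (a + c) / 2 - (b + d) / 2
    # Half the difference between the work effect in shorts and in overalls.
    interaction = ((a - b) - (c - d)) / 2
    return shorts_minus_overalls, working_minus_resting, interaction


# Posture means from Table 3 (Section A), in grammes.
sh_o, w_r, inter = posture_contrasts(2582, 1475, 2721, 1409)
print(sh_o, w_r, inter)
```

The results, -36.5, 1209.5 and -102.5, round to the printed -37, 1210 and -103. The negative interaction corresponds to the remark that the excess of working over resting is a little greater in overalls.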




TABLE 3 (SECTION B)
TOTAL SWEAT LOSS: AVERAGE POSTURE AND AVERAGE CLIMATE EFFECTS
(Mean values in grammes. No correction necessary)


(i) Posture Means
Air velocity 300 ft/min.

(a) Shorts working   2350        (c) Overalls working   2612
(b) Shorts resting   1477        (d) Overalls resting   1387

Shorts - Overalls     -86
Working - Resting    1049
Interaction          -177

Standard error 38 (for each of the above means)


(ii) Climate Means
Air velocity 300 ft/min.

Wet bulb               Dry bulb temperature (°F.)       Mean
temperature (°F.)        90        100        120

    80                  990       1606       3101       1899
    83                  969       1577       3015       1854
    85                  904       1734       2932       1856
    88                 1115       1867       3666       2216
  Mean                  994       1696       3179       1956

Standard error of single means 66

Remarks: The following effects may be noted when the air velocity is 300 ft/min. The direct
posture effects are not completely informative if we do not consider the interactions with climate as
well. The excess sweat rate of working over resting subjects is the only important significant average
effect. It is a little greater in overalls than in shorts.

The average sweat rate increases greatly and steadily as the dry bulb temperature increases from
90 to 120°F. When the wet bulb temperature increases from 80 to 88°F. there is a small fall followed
by a rise which is somewhat greater at the higher dry bulb temperatures.


On the average, the sweat rate is a little greater with the wet bulb 
temperature at 85° F. than at 80° F. and very much greater when the 
dry bulb temperature is at 120° F. than at 90° F. 

When the air velocity was 300 ft/min. the sweat rate increased 
greatly and steadily as the dry bulb temperature increased from 90° 
to 120° F. When the wet bulb temperature increased from 80° to 
88° F. there was a small fall followed by a rise which is somewhat
greater at the higher dry bulb temperatures. 

(3) The average sweat rate in shorts is greater than in overalls when


TABLE 4 (SECTION A)
TOTAL SWEAT LOSS: INTERACTIONS OF POSTURE AND CLIMATE
(Mean values in grammes, corrected for personal differences in uniformity trial)


(i) Air Velocity: Dry Bulb Temperature: Posture

Air velocity          Dry bulb temperature 90°F.                Dry bulb temperature 120°F.
(ft/min.)        (a)     (b)     (c)     (d)    Sh-O       (a)     (b)     (c)     (d)    Sh-O

    44          1791     494    2045     691    -226      3746    2177    4155    2531    -382
    86          1499     459    1705     550    -148      3234    2120    3593    1881     -60
   300          1305     459    1493     531    -130      3601    2698    3531    2236    +266
   500          1198     303    1372     347    -109      4283    3095    3876    2504    +499

All standard errors for single means are between 106 and 110


Air velocity                 Difference in loss (120-90°F. dry bulb)
(ft/min.)        (a)     (b)     (c)     (d)    Sh-O     W-R    Interaction

    44          1955    1683    2110    1840    -156     271        +1
    86          1735    1661    1888    1331     +89     316      -242
   300          2296    2239    2038    1705    +396     195      -138
   500          3084    2793    2503    2158    +608     318       -27
  Mean          2268    2094    2135    1759    +234     275      -102

All standard errors for single means are between 150 and 160


(ii) Air Velocity: Wet Bulb Temperature: Posture

Air velocity                 Difference in loss (85-80°F. wet bulb)
(ft/min.)        (a)     (b)     (c)     (d)    Sh-O     W-R    Interaction

    44           849     179     339     555      67     227       443
    86            29     -65     671     261    -484     252      -158
   300           -21    -213     -61    -215      21     173        19
   500            86      82     448      34    -157     209      -205
  Mean           236      -4     349     159    -138     215        12

All standard errors for single means are between 150 and 160

(a) = Shorts working        (c) = Overalls working
(b) = Shorts resting        (d) = Overalls resting
Sh = Shorts                 W = Working
O = Overalls                R = Resting

Remarks: (i) At air velocities of 300 and 500 ft/min. the average sweat rate in shorts is greater
than in overalls, when the dry bulb temperature is 120°F.; but when it is 90°F. the reverse is true.
The “cross-over” is at a point below 100°F. for resting subjects but higher for working subjects. At
the two lower air movements, the sweat rate is higher in overalls than in shorts, at both 120 and 90°F.

(ii) The especially large values -484, +443 need discussion. With an air velocity of 86 ft/min.
the sweat rate is on the average lower in overalls than in shorts when the wet bulb temperature is
80°F. but higher in overalls than in shorts when it is 85°F. This is mainly due to the anomalous
values for shorts for the 85°F. wet bulb/120°F. dry bulb temperature combination.

When the air velocity is 44 ft/min. and the dry bulb temperature 120°F., in working subjects
the sweat rate with overalls is appreciably above that with shorts when the wet bulb temperature is
80°F. and about the same at 85°F.; in resting subjects the reverse is the case. This effect does not
occur at this air movement when the air temperature is only 90°F.


TABLE 4 (SECTION B)
TOTAL SWEAT LOSS: INTERACTIONS OF POSTURE AND CLIMATE
(Mean values in grammes, corrected for personal differences in uniformity trial)


(i) Posture: Dry Bulb Temperature (Air Velocity 300 ft/min.)

Dry bulb                          Posture
temp. (°F.)      (a)       (b)         (c)     (d)    Sh-O     W-R    Interaction

    90          1416       444        1536     582    -129     963         9
   100          2055      1213        2378    1128    -124    1041      -199
   120          3578   [illegible]    3921    2440      -4    1142      -329

All standard errors for single means are between 65 and 73


(ii) Posture: Wet Bulb Temperature (Air Velocity 300 ft/min.)

Wet bulb                          Posture
temp. (°F.)      (a)       (b)         (c)     (d)    Sh-O     W-R    Interaction

    80          2329      1490        2394    1384      21     925       -86
    83          2190      1461        2430    1335     -57     912      -183
    85          2335      1420        2421    1251      42    1043      -128
    88          2545      1539        3203    1579    -349    1315      -309

All standard errors for single means are between 75 and 84

Remarks: (i) The following effects may be noted when the average air velocity is 300 ft/min.
The sweat rate is on the average greater in overalls than in shorts when the dry bulb temperature is
90 and 100°F. but not when it is 120°F. The sweat rate in working subjects is greater in overalls
than in shorts and increasingly so with a rise in dry bulb temperature (at 120°F., this result is different
from Section A; this is due to the inclusion of the results for 88°F. wet bulb temperature in the average);
in resting subjects it is less in overalls than in shorts except at the dry bulb temperature of 90°F.

(ii) The sweat rate in shorts is on the average about the same as in overalls except at the wet
bulb temperature of 88°F.; but at 88°F. the average sweat rate in overalls is considerably higher than
in shorts. For working subjects the average sweat rate is higher in overalls than in shorts at all four
wet bulb temperatures; in resting subjects it is lower in overalls than in shorts, except at 88°F. where
it is about the same.


the air velocity is 300 and 500 ft/min. and the dry bulb temperature
is at 120° F.,* but at 90° F. the reverse is true. The “cross over” point 
is below the dry bulb temperature of 100° F. for resting subjects but 
higher for working subjects, for whom it depends partly on the level of 
wet bulb temperature. At the two lower air movements the sweat 
rate is higher in overalls than in shorts at dry bulb temperatures of 
both 120° and 90° F. 

When the average air velocity is 300 ft/min. the average sweat 
rate in working subjects is greater in overalls than in shorts for all four 
wet bulb temperatures and increasingly so with a rise in dry bulb 
temperature; in resting subjects it is less in overalls than in shorts 
except for the dry bulb temperature of 90° F. and 88° F. wet bulb 
temperature, when it is about the same. 


4. RELATION OF THE EXPERIMENTAL RESULTS TO P4SR AND
EFFECTIVE TEMPERATURE 


The following method was used to compare the “total sweat rates”
obtained from this series of experiments with the P4SR values obtained 
from the nomogram constructed by McArdle and his colleagues. 

In Section A there were available 64 climate-posture combinations; 
in Section B, 48. The regressions of $y$ on $x_1$ were calculated for each
series separately, where

$y$ = corrected mean total sweat,
$x_1$ = P4SR value from the nomogram.


An analysis of variance was then performed on the deviations from 
regression, so that it was possible to see for which treatment combinations 
departures from the expected values were significant and to examine 
all such effects. 

The two regression lines proved to be: 


Section A    $Y = 2047 + 0.9783(x_1 - 1968)$
Section B    $Y = 1957 + 0.9248(x_1 - 1890)$


These two lines agree well; they do not differ significantly in position 
or slope. They both yield positive values for “total sweat” when the
P4SR is zero; these are 122 ± 46 and 209 ± 39. Both values are small.
Had the lines been assumed to go through $x_1 = 0$, $Y = 0$ the slopes
would have been 1.023 and 1.006. Thus, on the average, the total 
sweat rates in this experiment agree closely with those predicted from 


*This is true at 500 ft./min. for working and resting subjects separately. It is true for working 
subjects at 300 ft./min. if only the results for 80°F. wet bulb and 85°F. wet bulb are included in the 
average (though the difference is small); but not if all four levels of wet bulb are included. It is true 
for resting subjects at 300 ft./ min. 
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the nomogram for the same working conditions. The overall correlation 
coefficient between them is 0.96. 
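
The two intercepts quoted above follow directly from the fitted lines. A minimal arithmetic check, using only the coefficients printed in the text (the variable and function names are illustrative, not the author's):

```python
# Fitted lines quoted in the text, in the form Y = mean_y + slope * (x - mean_x).
LINES = {
    "A": (2047.0, 0.9783, 1968.0),
    "B": (1957.0, 0.9248, 1890.0),
}


def sweat_at_zero_p4sr(mean_y, slope, mean_x):
    """Predicted total sweat when the P4SR value is zero."""
    return mean_y - slope * mean_x


intercepts = {name: sweat_at_zero_p4sr(*coef) for name, coef in LINES.items()}
for name, value in intercepts.items():
    print(f"Section {name}: {value:.1f}")
```

Rounded to whole numbers these are the 122 and 209 given in the text; the standard errors (46 and 39) cannot be recomputed without the original observations.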

Nevertheless, there are significant differences between the devia- 
tions from regression of particular treatment combinations, and these 
have all been examined carefully. 

The most important of these departures from expectation can be 
expressed in the following way. 

In London, men wearing overalls had a higher sweat loss than men
wearing shorts over the whole range of climatic conditions. Consequently,
the values predicted from the P4SR nomogram had this
same property. In Singapore, however, while this was true at a dry
bulb temperature of 90° F., at 120° F. the total sweat loss was in general
greater for men wearing shorts at all the air velocities except the lowest.

A similar method was used to compare the “total sweat rates” obtained
from this series of experiments with their expected values calculated
from the regression of y on “effective temperature.”

The two regression lines are: 


Section A   $Y = 2047 + 208.4(x_2 - 86.50)$
Section B   $Y = 1957 + 209.6(x_2 - 86.05)$

with $y$ = corrected mean total sweat
     $x_2$ = “effective temperature”


The results for Section A and B agree closely, but whereas the overall 
correlation of total sweat with P4SR is 0.96, with effective temperature 
it is only 0.78. Thus the deviations of total sweat rates from the values 
expected on the basis of effective temperature are very considerably 
greater than on the basis of P4SR. These deviations were also examined 
carefully. It was apparent, however, that the main cause was the 
difference in position of the regression lines for working and resting 
subjects considered separately (the slopes were not significantly differ- 
ent). This might have been anticipated because the effective tempera- 
ture scale was originally constructed for resting subjects, and would not 
necessarily apply in the same way to working subjects. In fact, in 
terms of physiological reactions or of comfort an effective temperature 
of, say, 85° does not mean the same thing for working as for resting 
subjects. In other words, if separate regressions for working and resting
subjects are calculated, for P4SR the two regressions form one line, whereas
for effective temperature there is a difference in position but not in slope.
Table 5 makes the matter quite clear. It gives the average correlations
within the four work-clothing groups and for all four groups combined.
Within groups effective temperature is almost as good a predictor as
P4SR for total sweat, and a rather better one for rectal temperature and
pulse rate.
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TABLE 5 
COEFFICIENTS OF AVERAGE CORRELATION, WITHIN WORK CLOTHING GROUPS AND
FOR ALL FOUR GROUPS COMBINED, OF ENVIRONMENTAL INDICES P4SR AND
“EFFECTIVE TEMPERATURE” WITH PHYSIOLOGICAL OBSERVABLES


                                  P4SR               Effective temperature*
Physiological               Within    Groups         Within         Groups
observable                  groups    combined       groups         combined

Total sweat                  0.98      0.96        0.93 (0.91)    0.80 (0.78)
End rectal temperature       0.84      0.74        0.90 (0.86)    0.48 (0.48)
Pulse rate                   0.86      0.84        0.90 (0.86)    0.63 (0.63)


*These values allow for non-linearity of the regression; the values in brackets are the 1st order 
correlation coefficients. 


We have summarised the comparison of P4SR and effective tem- 
perature as follows: 


“The Singapore observations on sweat loss and changes in body temperature 
and heart rate in general support the conclusion of the London workers that the 
effective temperature scales are not completely adequate to describe the general 
pattern of changes in these variables for varying wet bulb temperatures and air 
velocities, but they indicate that the effective temperature scales over-emphasize the 
contribution of the wet bulb temperature to thermal stress and not the dry bulb 
temperature as the London workers suggested. 

If each work-clothing combination is taken separately, the predictive accuracies 
for these “naturally” acclimatised naval ratings of the effective temperature scales 
and the “predicted four-hour sweat rate” nomogram constructed by the London
workers are about the same, though there is a slight advantage to the latter in 
predicting sweat loss. However, when the results of all groups of experiments are 
combined, correlations with effective temperature are considerably lower than with 
“predicted four-hour sweat rate”. This is due to the fact that the effective tempera- 
ture scales make no allowance for differences in work rates. They were not designed 
to do so, being intended originally only for comparisons when clothing and work 
conditions were held constant. In this sense “predicted four-hour sweat rate” is a
more comprehensive index. It also gives a more adequate picture of the change in 
stress with air movement. In hot conditions effective temperature does not allow 
adequately for the reduction in stress as air velocity is increased to 100 ft/min. and 
indicates too great an improvement at 200-500 ft/min. On the other hand, the 
concept of predicted 4-hour sweat rate can only be applied within the range of 
climate-work-clothing combinations which cause people to sweat. It cannot replace 
effective temperature under the more comfortable and desirable conditions of light 
and sedentary work with which it was designed to deal primarily, for sweating will 
not occur under these conditions. In this sense it is less comprehensive than effective 
temperature, but it is a more accurate index of physiological effect under conditions 
of thermal stress.” 


Finally it is worth mentioning that the relation between the observed
body-temperatures and pulse rates at the end of the experiments
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and the four-hour sweat loss predicted from the nomogram has been 
used tentatively to define the upper tolerable levels of warmth (those 
levels above which an increasing number of men will fail to complete 
their work) for working men in the same state of acclimatisation as the 
Singapore subjects and engaged in activities involving a similar rate 
of energy expenditure. This indicates that these levels are reached 
when climate/work/clothing combinations correspond to a four-hour 
sweat loss of 3.5 litres predicted from the nomogram of McArdle et al 
(1947). 


5. CONCLUSION 


I have tried to illustrate by an example of work done at Singapore 
a few aspects of a very wide subject. 

The experiment described was one on men naturally acclimatised 
to the tropics and made possible a comparison of their performance 
with that of men artificially acclimatised in London. Subsequent 
experiments of similar type at Singapore tested the performance of 
men artificially acclimatised with a similar routine to that used in London
and of ship-acclimatised seamen who were brought to the laboratory 
for a single exposure to work at high temperatures. 

There was not a great deal of difference in the levels of response to 
work at high temperatures of the artificially acclimatised men in London, 
artificially acclimatised men in Singapore, naturally acclimatised men 
exposed twice a week in Singapore and ship-acclimatised men in Singapore,
although the latter two groups were, if anything, rather less tolerant
than the highly trained laboratory subjects in the first two groups. On
the other hand, Dr. J. S. Weiner has repeated the Singapore experiment
on ship-acclimatised men with completely untrained subjects at Oxford
to show that completely unacclimatised men in England are very 
much less tolerant than any of these groups. It seems fairly certain 
that for short exposures to work at high temperatures, conclusions 
derived from the study of heat acclimatised subjects in London are 
broadly applicable to heat-acclimatised men working at high temperatures
in the tropics.

Other experiments at Singapore examined the effects of radiant heat
very carefully. There were trials with living subjects; also some interesting
cross-checks on the calculation of mean radiant temperature were
obtained by the use of artificially constructed metal men.

Further, an extensive series of trials was carried out by Mr. R. D.
Pepler, the psychologist to the unit, on the effects of high temperatures
on mental performance. However, these do not fall within the limits
of the present discussion.
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_ The Tropical Research Unit at Singapore was directed by Surgeon 
Commander F. P. Ellis. To him and to Dr. R. K. Macpherson, the 
physiologist to the unit, I am much indebted for the interest they have 
given me in this work; but much more important is the indebtedness 
to them of the subject of Climatic Physiology. 
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CONFIDENCE LIMITS FOR MEASURING THE PRECISION 
OF BIOASSAYS* 


C. I. Bliss


The Connecticut Agricultural Experiment Station and Yale University, New Haven, 
Connecticut, U.S.A. 


An experiment for determining how much of one preparation, the 
“Unknown”, is needed to produce the same reaction in living material 
as a stated amount of a second preparation, the “Standard”, is known 
as a bioassay. Bioassays may be divided experimentally into two 
types, (1) those where the dependent variable is a threshold dose 
measured directly in each test animal, and (2) those where the dependent 
variable is the size of the reaction to dosages fixed by the experimenter. 

In the few assays of the first type, each threshold dose can be trans- 
formed to its logarithm and the log-relative potency (M’) computed 
either as the difference between two mean log-doses, or as a mean differ- 
ence. Both have confidence intervals that are well-known and simple. 
In the many assays of the second type, the difference in the response to 
the two preparations must be converted to units of dose. If the response 
plots linearly against the log-dose, the difference between the two mean 
responses is divided by the common slope of the log-dose response 
curves for the Standard and for the Unknown to obtain M’. If the 
response gives a straight line with arithmetic dosage units, potency is 
computed instead from the ratio of the slope for the Standard and that 
for the Unknown. In either case, the log-potency or potency depends 
upon the ratio of two statistics. The confidence limits for a ratio are 
more complex than those for a difference.

In the form proposed by Marks (Fieller [1944]) for balanced cross- 
over assays, and applied later by Gridgeman [1951] to other factorial 
assays, the confidence or fiducial limits of a ratio are not difficult to 
compute. With little loss in simplicity, Marks’ equation can be ex- 
tended to all assays based upon the ratio of two statistics, and accord- 
ingly it has been adopted in U.S.P. XV [1955]. The purpose of this 
paper is to review the general equation and consider some of its exten- 
sions in sufficient detail to facilitate their understanding and use by 
the practising bioassayer who is not a professional statistician or 
biometrician. Since the calculation of each U.S.P. assay, including its 
confidence interval has been illustrated elsewhere (Bliss [1956a]), the 
numerical examples are restricted here primarily to non-official assays. 

*Published with the aid of an educational grant from the U. S. Pharmacopoeial
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CONFIDENCE LIMITS FOR ASSAYS BASED UPON A DIFFERENCE 


In the assay for digitalis, the threshold dose is determined directly. 
A test solution is injected slowly into the alar vein of a pigeon until its 
heart stops, the amount injected being the threshold dose. This is 
determined in six or more pigeons with the Standard (S) and again 
with the Unknown (U). When thus measured directly, the log-threshold 
dose follows a normal distribution (Bliss [1944]). If each dose is transformed
to its logarithm $x$, the log-relative potency $M'$ of the Unknown
is the difference between the two mean log-doses or


$$M' = \bar{x}_S - \bar{x}_U \tag{1}$$


The variance of a single $x$ is computed from the $N_S$ and $N_U$ log-doses
for each preparation as


$$s^2 = \{\Sigma x^2 - (\Sigma x_S)^2/N_S - (\Sigma x_U)^2/N_U\}/n \tag{2}$$


with $n = N_S + N_U - 2$ degrees of freedom. The confidence limits of
$M'$ are given by the familiar equation


$$X_M = M' \pm st\sqrt{1/N_S + 1/N_U} = M' \pm \tfrac{1}{2}L \tag{3}$$

where $t$ is the tabular value of Student's $t$ at $P = 0.05$ for $n$ degrees of
freedom and $L$ is the length of the log-confidence interval.

For the threshold dose of digitalis in pigeons the standard deviation 
is so small relative to its mean, that the log transformation can be 
omitted and essentially the same potency obtained directly from the 
ratio of the two mean threshold doses. This is the calculation given 
in U.S.P. XV. Although potency is computed more easily as a ratio, 
its confidence interval is more complex, as will be shown later. The 
anti-logarithms of the simpler limits in Equation (3) agree closely with
those for the ratio of two mean threshold doses (Bliss [1956a]). 
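
Equations (1) to (3) can be sketched in a few lines. The function below is an illustration only: the log-doses in the usage line are invented, the function name is hypothetical, and the tabular value of Student's t must be supplied by the user for the appropriate degrees of freedom.

```python
import math


def difference_limits(log_doses_s, log_doses_u, t):
    """Log-relative potency M' and its limits for a direct (threshold) assay.

    log_doses_s, log_doses_u: log-threshold doses for Standard and Unknown.
    t: tabular Student's t for n = N_S + N_U - 2 degrees of freedom.
    """
    n_s, n_u = len(log_doses_s), len(log_doses_u)
    sum_s, sum_u = sum(log_doses_s), sum(log_doses_u)
    m_rel = sum_s / n_s - sum_u / n_u                      # Equation (1)
    total_sq = sum(x * x for x in log_doses_s) + sum(x * x for x in log_doses_u)
    df = n_s + n_u - 2
    s2 = (total_sq - sum_s ** 2 / n_s - sum_u ** 2 / n_u) / df   # Equation (2)
    half_l = t * math.sqrt(s2 * (1 / n_s + 1 / n_u))       # half of L, Equation (3)
    return m_rel, m_rel - half_l, m_rel + half_l, df


# Hypothetical log-threshold doses; t = 2.776 is the tabular value for 4 d.f.
m_rel, lower, upper, df = difference_limits([1.0, 1.1, 1.2], [0.9, 1.0, 1.1], t=2.776)
print(m_rel, lower, upper, df)
```

Taking anti-logarithms of the three returned values gives the relative potency and its limits on the arithmetic scale.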

A second drug assayed from the threshold dose is tubocurarine. 
A test solution is injected slowly into an ear vein of a rabbit until its 
head drops, the amount of tubocurarine injected being the threshold 
dose. The rabbit is revived and tested again the following day. Half 
of the rabbits, f in number, are injected first with the Standard and 
then with the Unknown, and the remaining half in the reverse order. 
The metameter $x$ in each rabbit is the difference of the log-threshold
dose of the Standard minus the log-threshold dose of the Unknown
so that here the log-relative potency of the Unknown is the mean
difference for the $2f$ rabbits, or $M' = \Sigma x/2f = \bar{x}$. The error variance
of a single $x$ is computed from the variation within groups as


$$s^2 = \{\Sigma x^2 - (T_1^2 + T_2^2)/f\}/n \tag{4}$$


where $T_1 = \Sigma x$ for the rabbits injected first with the Standard,
$T_2 = \Sigma x$ for those injected in the reverse order, and $n = 2f - 2$.
The length of the log-confidence interval for a mean difference based
upon $2f$ values of $x$ is the standard error of $M'$, $s_{M'}$, multiplied by twice
the tabular value of Student's $t$ at $P = .05$ for $n$ degrees of freedom,
giving

$$L = 2st/\sqrt{2f} = 2s_{M'}t \tag{5}$$


with limits $X_M = M' \pm \tfrac{1}{2}L$.


CONFIDENCE LIMITS FOR A RATIO 


In most bioassays, relative potency is computed from the ratio of 
two statistics. The sample preparation or Unknown (U) is assigned
a provisional potency in units of the Standard (S). Both preparations
are then tested at similar dosage levels in terms of their assumed and
known potencies. The objective is to adjust this provisional potency.
In a parallel-line assay, the response to both preparations is expressed
in units $y$, which, over an adequate working range, can be plotted
against their respective log-doses $x$ as parallel straight lines with a
common slope $b$. The difference between the two mean responses,
$\bar{y}_U - \bar{y}_S = a$, is divided by the slope $b$ to convert $a$ to units of log-dose.
The log-relative potency then has the simple form


$$M' = (\bar{y}_U - \bar{y}_S)/b = a/b \tag{6}$$


Both $a$ and $b$ are statistics or estimates subject to sampling errors. As
the ratio of two statistics, $M'$ is also a statistic. It is an estimate of
the true relative potency of the Unknown, which we will call $\mu$, and since
it is only an estimate, we need a measure of its precision.


The basic equation. 


The precision of M’ is measured in terms of the interval L between 
an upper and a lower confidence limit. These limits are so spaced 
above and below M’ that, on the average, they will include the true 
log-relative potency $\mu$ in a selected proportion of assays, commonly in
19 out of 20 ($1 - P = 0.95$). Since the responses $y$ are normally distributed,
the statistics $a$ and $b$ from which $M'$ is computed are also
normal, with variances $v_{aa}$ and $v_{bb}$ and covariance $v_{ab}$. Moreover,
$a$ and $b$ are unbiassed estimates of the average difference in response
and of the average slope, respectively. It follows that the difference
$a - \mu b$, in unlimited repeated assays, is also normally distributed with
a mean of zero. The variance of the difference $a - \mu b$ is then


$$V = v_{aa} - 2\mu v_{ab} + \mu^2 v_{bb}$$

These definitions lead readily to the confidence limits for $\mu$ as the
roots of a quadratic equation (Fieller [1944]). Since the variances of
$a$ and $b$ in any given assay are based necessarily upon limited information,
the distribution of the ratio $(a - \mu b)/\sqrt{V}$ is that of Student's $t$.
Confidence limits for $\mu$ are obtained by solving the inequality
$-t < (a - \mu b)/\sqrt{V} < t$ for $\mu$, or the equivalent equation


$$(a - \mu b)^2 - t^2 V = 0$$

Usually, $t$ is the tabular value for $P = 0.05$ of Student's $t$, with appropriate
degrees of freedom, where $P$ is the proportion of assays in which
the confidence interval does not include $\mu$. Substituting for $V$ its
definition, we have the quadratic equation

$$\mu^2(b^2 - v_{bb}t^2) - 2\mu(ab - v_{ab}t^2) + (a^2 - v_{aa}t^2) = 0$$

Solved for $\mu$, the roots of this equation are the required confidence
limits, $X_M$, or

$$X_M = \frac{(ab - v_{ab}t^2) \pm \sqrt{(ab - v_{ab}t^2)^2 - (a^2 - v_{aa}t^2)(b^2 - v_{bb}t^2)}}{b^2 - v_{bb}t^2}$$

This equation has appeared in a variety of forms (Bliss [1945]; Finney
[1952b]). For conversion to that proposed by Marks (Fieller [1944]) let

$$C = b^2/(b^2 - v_{bb}t^2) \tag{7}$$


from which $C - 1 = v_{bb}t^2C/b^2$.* Substituting $C$ and $M'$, the above
equation for $X_M$ reduces algebraically to


$$X_M = CM' - K \pm \sqrt{(C - 1)(CM'^2 + v_{aa}/v_{bb}) + K(K - 2CM')} \tag{8}$$


or


$$X_M = CM' - K \pm \tfrac{1}{2}L$$


where $K = (C - 1)v_{ab}/v_{bb}$ and $L$ is the length of the confidence interval.
When the numerator and denominator for $M' = a/b$ are independent,
so that $v_{ab} = 0$, all expressions involving $K$ vanish and Equation (8)
simplifies to


$$X_M = CM' \pm \sqrt{(C - 1)(CM'^2 + v_{aa}/v_{bb})} = CM' \pm \tfrac{1}{2}L \tag{9}$$


Although a, b and M’ here have specific meanings, these equations can 
be applied to the ratio of two statistics with other definitions for a and b. 
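
As a numerical check on Equations (7) and (8), the following sketch computes the limits in Marks' form and confirms that each limit is a root of the quadratic $(a - \mu b)^2 = t^2 V$. The function name and the input values are hypothetical, chosen only so that $b^2 > v_{bb}t^2$; they are not taken from any assay in the paper.

```python
import math


def ratio_limits(a, b, v_aa, v_bb, v_ab, t):
    """Confidence limits for mu = a/b from Equations (7) and (8)."""
    denom = b * b - v_bb * t * t
    if denom <= 0:
        # The slope is not significantly different from zero: no finite interval.
        raise ValueError("b^2 must exceed v_bb * t^2")
    c = b * b / denom                       # Equation (7)
    m_rel = a / b
    k = (c - 1) * v_ab / v_bb
    half_l = math.sqrt((c - 1) * (c * m_rel ** 2 + v_aa / v_bb)
                       + k * (k - 2 * c * m_rel))
    centre = c * m_rel - k
    return centre - half_l, centre + half_l  # Equation (8)


# Hypothetical statistics for illustration.
a, b, v_aa, v_bb, v_ab, t = 2.0, 10.0, 0.5, 0.8, 0.1, 2.0
lower, upper = ratio_limits(a, b, v_aa, v_bb, v_ab, t)
print(lower, upper)
```

Setting `v_ab` to zero reproduces Equation (9), since `k` then vanishes.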

The larger the inequality $b^2 > v_{bb}t^2$, the more nearly $C$ approaches
unity and the shorter is the interval L. L is a convenient index to the 
inherent precision of an assay but not always to its accuracy, since the 


*To avoid confusion with other terminologies, note that $C$ in Equation (7) is equivalent to $C^2$
in publications by the present author [1945, 52] and to $1/(1 - g)$ in those by Finney [1952a, b].
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potencies computed from independent replicate assays may differ more 
than would be predicted from their individual confidence intervals. 
When such a discrepancy occurs, it may be due to chemical instability, 
lack of uniformity in the drug, differences in moisture content, or some 
other variable that changes between but not within assays. Fortunately, 
most assays are either free of this complication or not sufficiently precise 
for it to be detected. 

The log-relative potency $M'$ from Equation (6) is easily converted to
the log-potency $M$ by the equation


$$M = M' + \bar{x}_S - \bar{x}_U = M' + \log R \tag{10}$$


where $\bar{x}_S$ and $\bar{x}_U$ are the mean log-doses of the two preparations. Their
difference in a balanced assay is equal to $\log R$, where $R$ is the ratio of
a given dose of the Standard to the corresponding dose of the Unknown.
$\bar{x}_S$, $\bar{x}_U$ and $R$ are assumed to be controlled measurements with negligible
sampling errors. To obtain the confidence limits of $M$ from those of
$M'$, we need only add $\bar{x}_S - \bar{x}_U$, or $\log R$, to the upper and lower limits
of $M'$, $X_M$.


General confidence limits without covariance 


Wherever possible, parallel-line assays are designed so that a and b 
in Equation (6) are independent and the confidence interval can be 
based upon Equation (9). Several Unknowns, h in number, are some- 
times assayed concurrently against the same Standard. The entire 
assay may then be computed as a unit, on the assumption that the 
h + 1 preparations act similarly, in terms of the response y. If true, 
their log-dose response curves will have the same slope b within the 
experimental error. When the assay slope can be computed from the 
combined evidence of several preparations, its variance and $C$ are
reduced correspondingly. 

To compute the assay slope, the following terms are determined
from the $k$ log-doses $x$ for each preparation:


$$[x^2] = \Sigma(fx^2) - (\Sigma fx)^2/N_i$$
and
$$[xy] = \Sigma(xT_t) - \Sigma(fx)\,\Sigma T_t/N_i$$


where $T_t = \Sigma y$ at each $x$, $f$ is the frequency or number of $y$'s in each
treatment total $T_t$, and $N_i = \Sigma f$ for a given preparation. These
sums of squares and products are totalled over the $h + 1$ preparations
to obtain the assay slope


$$b = \Sigma[xy]/\Sigma[x^2] \tag{11}$$

with which each $M'$ is computed by Equation (6). The variation in
$y$ accounted for by $b$ is


$$B^2 = b^2\,\Sigma[x^2] = (\Sigma[xy])^2/\Sigma[x^2] \tag{12}$$


If the responding units have been assigned to each treatment
entirely at random and there is ample replication, the error variance
of the assay may be computed from the variation of the $N$ values of $y$
within treatments as


$$s^2 = \{\Sigma y^2 - \Sigma(T_t^2/f)\}/n \tag{13}$$


where the degrees of freedom $n = N - \Sigma k$. The variance of the slope
$b$ is


$$v_{bb} = s^2/\Sigma[x^2]$$


and from Equation (12), $b^2 = B^2/\Sigma[x^2]$. Substituting these terms in
Equation (7), we have


$$C = B^2/(B^2 - s^2t^2) \tag{14}$$


where $t$ is the tabular value of Student's $t$, usually at $P = 0.05$, for the
degrees of freedom $n$ in $s^2$.

One other term is needed to solve Equation (9), namely the ratio
$v_{aa}/v_{bb}$. The variance of the difference $\bar{y}_U - \bar{y}_S = a$ is computed
with the error variance of the assay ($s^2$) as


$$v_{aa} = s^2\left(\frac{1}{N_S} + \frac{1}{N_U}\right)$$


From the variance of the slope $v_{bb}$, defined above, the ratio


$$v_{aa}/v_{bb} = \left(\frac{1}{N_S} + \frac{1}{N_U}\right)\Sigma[x^2] \tag{15}$$


Note that this ratio depends only upon the design of the assay and not
upon its outcome.
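
The pooled slope and the design ratio of Equations (11), (12) and (15) can be sketched as follows. The data layout (a dictionary per preparation with log-doses `x`, frequencies `f` and treatment totals `T`) and all numbers are assumptions made for illustration; only the formulas come from the text.

```python
def sum_sq_x(log_doses, freqs):
    """[x^2] for one preparation: sum(f x^2) - (sum f x)^2 / N_i."""
    n_i = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, log_doses))
    return sum(f * x * x for f, x in zip(freqs, log_doses)) - sum_fx ** 2 / n_i


def sum_prod_xy(log_doses, freqs, totals):
    """[xy] for one preparation: sum(x T_t) - sum(f x) sum(T_t) / N_i."""
    n_i = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, log_doses))
    return sum(x * t for x, t in zip(log_doses, totals)) - sum_fx * sum(totals) / n_i


def pooled_slope(preparations):
    """Return b (Eq. 11), B^2 (Eq. 12) and the pooled sum of [x^2]."""
    sxx = sum(sum_sq_x(p["x"], p["f"]) for p in preparations)
    sxy = sum(sum_prod_xy(p["x"], p["f"], p["T"]) for p in preparations)
    return sxy / sxx, sxy ** 2 / sxx, sxx


# Hypothetical two-dose assay: 5 responses per dose, log-dose interval 0.3.
standard = {"x": [0.0, 0.3], "f": [5, 5], "T": [10.0, 25.0]}
unknown = {"x": [0.0, 0.3], "f": [5, 5], "T": [12.0, 27.0]}
b, b2_term, sxx = pooled_slope([standard, unknown])
ratio = (1 / sum(standard["f"]) + 1 / sum(unknown["f"])) * sxx   # Equation (15)
print(b, b2_term, ratio)
```

In this balanced two-dose layout the ratio equals the square of the log-dose interval, which illustrates the remark that it depends only upon the design.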


THE CONFIDENCE INTERVAL FOR SELECTED PARALLEL-LINE 
ASSAYS WITH A GRADED RESPONSE 


Since they depend only upon the design, the constants in a con- 
fidence interval can be determined in advance, a considerable advantage 
if the same design is to be used repeatedly. Those discussed below are 
factorial assays, involving the two factors of preparation and dosage 
level in balanced arrangements, with the same number of responses for 
each “treatment” or dosage level of a preparation. Factorial designs
facilitate both the calculation of potency and tests of the underlying 
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assumptions of parallelism and linearity. In the absence of more reli- 
able estimates, the error variance may be based upon discrepancies 
from these assumptions. 


Balanced factorial assays with one Unknown 


The simplest factorial assay involves two preparations, the Standard
and one Unknown, each at $k$ dosage levels, which, in turn, are spaced
at equal intervals $i$ on a logarithmic scale. The test units, which may
be hypophysectomized rats, blood pressure readings, tubes inoculated
with micro-organisms, or some other biological indicator, are assigned
at random, but in equal numbers, to each dose of each preparation.
If a value is lost during an assay, it is replaced at the start of the calculation,
as described later. The $f$ responses ($y$) for each treatment are
totalled to obtain the $T_t$; these $T_t$ lead directly to the log-potency
and its confidence interval.

The $T_t$ are multiplied by the factorial coefficients $x_a$ and $x_b$ in
Table 1 and the products summed to obtain $T_a = \Sigma(x_a T_t)$ and $T_b =
\Sigma(x_b T_t)$. The sums of the squares of these coefficients are $e_a =
\Sigma x_a^2 = 2k$ and $e_b = \Sigma x_b^2 = 2\Sigma(x_1^2)$, where the $x_1$'s are the coded
log-doses for a single preparation. The difference between the mean
response for the Unknown and that for the Standard is $\bar{y}_U - \bar{y}_S =
2T_a/(fe_a) = T_a/(fk)$. The combined slope of their log-dose response
curves is $b = T_b/(fe_b i') = 6T_b i'/\{fk(k^2 - 1)i^2\}$, where $i'$ is the log-dosage
interval corresponding to an interval of 1 in the coded units $x_1$. When
the number of dosage levels $k$ is odd, successive values of $x_1$ differ by 1
and $i' = i$; when $k$ is even, successive values of $x_1$ differ by 2 and $i' = i/2$.
Substituting in Equation (6), the log-relative potency of a balanced
factorial assay is

$$M' = ciT_a/T_b \tag{16}$$
where $c = 2e_b i'/(e_a i) = (k^2 - 1)i/(6i')$, as given at the end of Table 1.

Since $\Sigma x_a x_b = 0$, $T_a$ and $T_b$ in Equation (16) are independent,
and the confidence interval for $M'$ is based upon Equation (9). In
calculating $C$, $B^2$ in Equation (12) reduces to $B^2 = T_b^2/(e_b f)$, or $C$ may
be computed directly as

$$C = T_b^2/(T_b^2 - e_b f s^2 t^2) \tag{17}$$
The ratio $v_{aa}/v_{bb}$ in Equation (15) may be written as $c'i^2$, where $c' =
c^2 e_a/e_b$ or, as Gridgeman [1951] has shown, $c' = (k^2 - 1)/3$; the values
of $c'$ for $k = 2$ to 6 are given in Table 1. The length of the confidence
interval then is

$$L = 2\sqrt{(C - 1)(CM'^2 + c'i^2)} \tag{18}$$
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TABLE 1 
FACTORIAL COEFFICIENTS x FOR ANALYZING A BALANCED BIOASSAY, IN WHICH
SUCCESSIVE LOG-DOSES OF STANDARD (S_i) AND OF SAMPLE OR “UNKNOWN”
(U_i) ARE SPACED EQUALLY, EACH WITH THE SAME NUMBER (f) OF RESPONSES
TOTALLING T_t


                     Factorial coefficient x for each dose
Design  Row    S1   S2   S3   S4   S5   S6     U1   U2   U3   U4   U5   U6     e_i   T_i

2,2     a      -1   -1                          1    1                           4   T_a
        b      -1    1                         -1    1                           4   T_b
        ab      1   -1                         -1    1                           4   T_ab

3,3     a      -1   -1   -1                     1    1    1                      6   T_a
        b      -1    0    1                    -1    0    1                      4   T_b
        ab      1    0   -1                    -1    0    1                      4   T_ab
        q       1   -2    1                     1   -2    1                     12   T_q
        aq     -1    2   -1                     1   -2    1                     12   T_aq

4,4     a      -1   -1   -1   -1                1    1    1    1                 8   T_a
        b      -3   -1    1    3               -3   -1    1    3                40   T_b
        ab      3    1   -1   -3               -3   -1    1    3                40   T_ab
        q       1   -1   -1    1                1   -1   -1    1                 8   T_q
        aq     -1    1    1   -1                1   -1   -1    1                 8   T_aq

5,5     a      -1   -1   -1   -1   -1           1    1    1    1    1           10   T_a
        b      -2   -1    0    1    2          -2   -1    0    1    2           20   T_b
        ab      2    1    0   -1   -2          -2   -1    0    1    2           20   T_ab
        q       2   -1   -2   -1    2           2   -1   -2   -1    2           28   T_q
        aq     -2    1    2    1   -2           2   -1   -2   -1    2           28   T_aq

6,6     a      -1   -1   -1   -1   -1   -1      1    1    1    1    1    1      12   T_a
        b      -5   -3   -1    1    3    5     -5   -3   -1    1    3    5     140   T_b
        ab      5    3    1   -1   -3   -5     -5   -3   -1    1    3    5     140   T_ab
        q       5   -1   -4   -4   -1    5      5   -1   -4   -4   -1    5     168   T_q
        aq     -5    1    4    4    1   -5      5   -1   -4   -4   -1    5     168   T_aq


COMPUTATIONAL CONSTANTS

For computing   Equation   Constant        Value for k =
                                      2      3      4      5      6

M'              (16)       c          1      4/3    5      4      35/3
L               (18)       c'         1      8/3    5      8      35/3
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The factorial coefficients in rows ab, q and aq of Table 1 are for 
testing the validity of the assay. The treatment totals are multiplied
in turn by the coefficients in each row and the products summed to
obtain $T_i = \Sigma(x_i T_t)$. The ratios in these three rows, $T_i^2/e_i f$, should
not exceed the error variance $s^2$ significantly. That from row ab tests
whether the log-dose response lines are parallel; those from rows q
and aq test their linearity. If any ratio exceeds $s^2$ as much as three-fold
in an assay with three or more doses, the validity of the assay may
be tested by computing the variance ratio, $F = \Sigma(T_i^2/e_i f)/3s^2$, for
comparison with the tabular $F$ for $n_1 = 3$ and $n_2$ = degrees of freedom
in $s^2$.

The U.S.P. assay for Adrenal Cortex Injection is a two-dose factorial
assay with the ratio of the two dosage levels fixed at 5:3 or $i = 0.222 =
2/9$ approximately. Its log-relative potency can be computed as
$M' = 2T_a/9T_b$, and $L$ determined with $c'i^2 = 0.04922$. An example
of a three-dose U.S.P. assay is that for Corticotropin Injection, where
the log-interval $i$ is selected by the experimenter.
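
The constants $c$ and $c'$ for a balanced assay follow from the coded doses alone. The sketch below derives them from the definitions $c = 2e_b i'/(e_a i)$ and $c' = c^2 e_a/e_b$ and compares $c'$ with Gridgeman's closed form; the function name is illustrative and exact fractions are used to avoid rounding.

```python
import math
from fractions import Fraction


def balanced_constants(k):
    """Return (c, c') for a balanced k,k assay with equally spaced log-doses."""
    if k < 2:
        raise ValueError("at least two dosage levels are required")
    if k % 2:                                   # odd k: coded doses differ by 1
        half = (k - 1) // 2
        coded = range(-half, half + 1)
        step_ratio = Fraction(1)                # i'/i = 1
    else:                                       # even k: coded doses differ by 2
        coded = range(-(k - 1), k, 2)
        step_ratio = Fraction(1, 2)             # i'/i = 1/2
    e_a = 2 * k
    e_b = 2 * sum(x * x for x in coded)
    c = Fraction(2 * e_b, e_a) * step_ratio
    c_prime = c * c * Fraction(e_a, e_b)
    return c, c_prime


for k in range(2, 7):
    print(k, *balanced_constants(k))

# Two-dose assay with a 5:3 dose ratio, as in the Adrenal Cortex example.
c2, c2_prime = balanced_constants(2)
i = math.log10(5 / 3)
print(float(c2) * i, float(c2_prime) * i * i)
```

For $k = 2$ both constants are 1, so $M' = iT_a/T_b$ with $i$ close to 2/9, and $c'i^2$ is the square of $\log(5/3)$.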


Partially-balanced factorial assays 


A bioassay is based upon the central part of the log-dose response 
curve, where the response y is substantially a linear function of the 
log-dose x. If the assumed potency of an Unknown is too high or too 
low, the response to its smallest or largest dose may approach a lower 
or an upper limit, so that the Standard and Unknown no longer give 
parallel straight lines over all doses. Dropping an end dosage level 
of the Unknown may restore assay validity but sacrifice the initial 
balance in respect to k. 

Whenever the two preparations differ by one in the number of
dosage levels and the log-dose interval $i$ is constant, the assay can be
computed and its validity tested with the factorial coefficients in Table 2,
proposed by Wood [1953]. If the Unknown has the larger number of
dosage levels, the coefficients for the Standard and the Unknown are
interchanged and their signs in rows a, ab and aq reversed. The log-relative
potency $M'$ and the confidence interval $L$ are computed as
before with Equations 16 and 18, but with $c = e_b/\Sigma|x_a|$ and $c' =
c^2 e_a/e_b$, as at the end of the table. Because of the unequal number of
dosage levels, $M'$ is converted to $M$ with the first form of Equation (10).

An example is provided by an assay of the vitamin C activity of 
fresh orange juice (Bliss [1952a]). The juice or Unknown was analyzed 
chemically for its ascorbic acid content, and on this basis fed to depleted 
guinea pigs at 0.5, 1.0 and 2.0 mg. of ascorbic acid per day. Pure 
ascorbic acid at the same dosage levels served as the Standard. When 
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TABLE 2 
FACTORIAL COEFFICIENTS x FOR ANALYZING A PARTIALLY BALANCED ASSAY, IN
WHICH SUCCESSIVE LOG-DOSES OF STANDARD (S_i) AND OF SAMPLE OR
“UNKNOWN” (U_i) ARE SPACED EQUALLY, EACH WITH THE SAME
NUMBER (f) OF RESPONSES TOTALLING T_t
If the number of successive doses of the Sample exceeds by one the number for the
Standard, interchange S_i and U_i in the heading and reverse all signs in rows a,
ab, and aq. (Wood [1953])


                     Factorial coefficients x for each dose
Design  Row    S1   S2   S3   S4   S5   S6     U1   U2   U3   U4   U5     e_i   T_i

2,1     a      -1   -1                          2                           6   T_a
        b      -1    1                          0                           2   T_b

3,2     a      -2   -2   -2                     3    3                     30   T_a
        b      -2    0    2                    -1    1                     10   T_b
        ab      1    0   -1                    -2    2                     10   T_ab
        q       1   -2    1                     0    0                      6   T_q

4,3     a      -3   -3   -3   -3                4    4    4                84   T_a
        b      -3   -1    1    3               -2    0    2                28   T_b
        ab      3    1   -1   -3               -5    0    5                70   T_ab
        q       3   -3   -3    3                2   -4    2                60   T_q
        aq     -1    1    1   -1                1   -2    1                10   T_aq

5,4     a      -4   -4   -4   -4   -4           5    5    5    5          180   T_a
        b      -4   -2    0    2    4          -3   -1    1    3           60   T_b
        ab      2    1    0   -1   -2          -3   -1    1    3           30   T_ab
        q       2   -1   -2   -1    2           1   -1   -1    1           18   T_q
        aq     -4    2    4    2   -4           7   -7   -7    7          252   T_aq

6,5     a      -5   -5   -5   -5   -5   -5      6    6    6    6    6     330   T_a
        b      -5   -3   -1    1    3    5     -4   -2    0    2    4     110   T_b
        ab     10    6    2   -2   -6  -10    -14   -7    0    7   14     770   T_ab
        q      10   -2   -8   -8   -2   10      …    …    …    …    …       …   T_q
        aq      …    …    …    …    …    …      …    …    …    …    …       …   T_aq


COMPUTATIONAL CONSTANTS

For computing   Equation   Constant            Value in design
                                      2,1    3,2     4,3     5,4    6,5

M'              (16)       c          1/2    5/6     7/6     3/2    11/6
L               (18)       c'         3/4    25/12   49/12   27/4   121/12
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computed as a 3, 3 assay, a significant departure from parallelism 
(P < .05) could be traced to a deficient response at the highest dose 
of the Unknown. With this one omitted, the remaining five doses 
gave the satisfactorily parallel 3, 2 assay in Table 3. 


TABLE 3 
A 3, 2 ASSAY IN GUINEA PIGS OF THE ANTISCORBUTIC ACTIVITY OF ORANGE JUICE
(U) RELATIVE TO PURE ASCORBIC ACID (S)
Response y = length of the odontoblasts in the incisors. The y at U_3 in the original
3,3 assay (T_t = 260.6) gave a significant departure from parallelism and has been
omitted in recomputing the potency. (Bliss [1952b])


Term      Coefficients x for dose               Σx²     T_i      Variance
          S1      S2      S3      U1      U2    e_i              T_i²/fe_i

a         -2      -2      -2       3       3     30     62.9
b         -2       0       2      -1       1     10    456.7
ab         1       0      -1      -2       2     10      9.4       8.84
q          1      -2       1       0       0      6      7.4       9.13

T_t      79.8   166.5   260.6   131.9   227.0    Error (n = 45), s² = 14.1080


$i = \log 2 = .30103$, $c = 5/6$, $M' = .03455$ (Eq. (16))
$\bar{x}_S = 0$, $\bar{x}_U = -.15052$, $M = .18507$ (Eq. (10)), $f = 10$
$t^2 = 4.057$, $C = 1.02822$ (Eq. (17)), $c'i^2 = .18879$
$\tfrac{1}{2}L = .07323$ (Eq. (18))
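
The constants beneath Table 3 can be retraced from the treatment totals and the 3, 2 coefficients. The following sketch does so with Equations (16), (17), (18) and (10); the inputs are the values printed in Table 3, while the variable names are only illustrative.

```python
import math

# Treatment totals from Table 3, in the order S1, S2, S3, U1, U2.
totals = [79.8, 166.5, 260.6, 131.9, 227.0]
x_a = [-2, -2, -2, 3, 3]          # row a of the 3,2 design
x_b = [-2, 0, 2, -1, 1]           # row b of the 3,2 design
f, s2, t2 = 10, 14.1080, 4.057    # responses per dose, error variance, t squared
i = math.log10(2)                 # log-dose interval

t_a = sum(x * t for x, t in zip(x_a, totals))
t_b = sum(x * t for x, t in zip(x_b, totals))
e_a = sum(x * x for x in x_a)
e_b = sum(x * x for x in x_b)

c = e_b / sum(abs(x) for x in x_a)                 # 5/6 for this design
c_prime = c * c * e_a / e_b                        # 25/12
m_rel = c * i * t_a / t_b                          # Equation (16)
big_c = t_b ** 2 / (t_b ** 2 - e_b * f * s2 * t2)  # Equation (17)
half_l = math.sqrt((big_c - 1) * (big_c * m_rel ** 2 + c_prime * i * i))  # Eq. (18)
m_log = m_rel + 0.0 - (-0.15052)                   # Equation (10), tabled mean log-doses

print(round(m_rel, 5), round(big_c, 5), round(c_prime * i * i, 5),
      round(half_l, 5), round(m_log, 5))
lower, upper = big_c * m_rel - half_l, big_c * m_rel + half_l   # Equation (9)
```

The limits for the log-potency $M$ are obtained by adding $\bar{x}_S - \bar{x}_U = .15052$ to `lower` and `upper`.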


Factorial assays with unequally-spaced log-doses 


When equally-spaced log-doses are not convenient, as in some micro- 
bial assays, factorial coefficients can be developed for other sequences. 
The more evenly the selected log-doses cover the range, the simpler 
are the coefficients. The linear coefficients $x_1$ are small whole numbers
with approximately the same spacing as the log-doses $x$ and an average
ratio between successive values of $i' = \Sigma\{x_1(x - \bar{x})\}/\Sigma x_1^2$. Any
given set of three or more linear coefficients $x_1$ can be matched with
orthogonal quadratic coefficients $x_2$ for testing curvature, as described
elsewhere (Bliss and Calhoun [1954]).

This design has been applied to microbial assays for the vitamins 
(Bliss [1956b]). With test solutions of a suitable concentration and 
dosages of 1.5, 2, 3 and 4 ml. per tube, the turbidimetric response to 
vitamin B,, , for example, in terms either of (100% transmittance) or 
its logarithm, has usually defined a straight line when plotted against 
the log-dose. The corresponding linear coefficients are 7; = —29, 
—12, 12 and 29, and the quadratic coefficients are x, = 1, —1, —1and1. 
The constants in the equations for computing M’ and L are derived as 


je 
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before; for the above sequence of four doses they are ci = 2%’¢,/e, = 
7.2332 and c’”” = (ci)’e,/e, = 0.10623. Similar coefficients have been 
determined for other series of 3 to 6 doses. 
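
As a check on the constants quoted for the four-dose sequence, the sketch below recomputes i', c_1 and c'² in Python. It assumes that e_b is summed over the Standard and one Unknown (twice the sum of x_1² for a single preparation) and that e_a = 2k, corresponding to coefficients of -1 for each dose of S and +1 for each dose of U; both are assumptions made for illustration, as are the function names.

```python
import math

doses_ml = [1.5, 2.0, 3.0, 4.0]     # dosages quoted in the text
x1 = [-29, -12, 12, 29]             # linear coefficients quoted in the text
x2 = [1, -1, -1, 1]                 # quadratic coefficients quoted in the text


def spacing_constants(doses, coeffs):
    """Return (i', c_1, c'^2) for a Standard and one Unknown."""
    logs = [math.log10(d) for d in doses]
    mean_log = sum(logs) / len(logs)
    sum_sq = sum(c * c for c in coeffs)
    # i' = sum{x1 (x - xbar)} / sum(x1^2)
    i_prime = sum(c * (x - mean_log) for c, x in zip(coeffs, logs)) / sum_sq
    e_b = 2 * sum_sq            # assumed: both preparations contribute
    e_a = 2 * len(doses)        # assumed: coefficients -1 (S) and +1 (U)
    c1 = 2 * i_prime * e_b / e_a
    c2_prime = c1 ** 2 * e_a / e_b
    return i_prime, c1, c2_prime


def is_orthogonal(u, v):
    """True when two coefficient rows have a zero sum of products."""
    return sum(a * b for a, b in zip(u, v)) == 0


i_prime, c1, c2_prime = spacing_constants(doses_ml, x1)
print(round(i_prime, 6), round(c1, 4), round(c2_prime, 5))
```

The quadratic row is orthogonal to the linear row, and the recomputed c_1 agrees with the printed 7.2332 to about one unit in the fourth decimal.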


Assays with more than one Unknown 


Two or more Unknowns may be assayed concurrently against the 
same Standard at equivalent dosage levels. Although the potency of 
each Unknown and its confidence interval can be determined inde- 
pendently, this both increases the amount of computing and sacrifices 
some of the precision inherent in the data. The Standard and the several 
Unknowns commonly will have linear log-dose response curves with 
nearly the same slope and the same error variance about these lines. 
Any questionable discrepancies can be tested for significance and the 
discordant preparations omitted. With more degrees of freedom, the 
assay variance ($s^2$) and slope ($b$) have greater stability, and C will be
computed with a smaller $t^2$. Confidence intervals determined with
these composite values, or their equivalent, average less than if based 
upon only part of the relevant data. 

The assay slope is computed by Equation (11), which for convenience
may be written as

$$b = 2\sum T_b \Big/ i' e_b \sum f \qquad (19)$$

where $T_b = \sum (x_1 T_t)$ for each preparation, $i'$ is the dosage interval in
logarithms ($x$) corresponding to a unit difference in $x_1$, and $\sum f$ is the
sum over all preparations of the number of observations at a single
dosage level. By analogy with Equation (16), the variation in $y$
attributable to this $b$ is

$$B^2 = 2\left(\sum T_b\right)^2 \Big/ e_b \sum f$$

If treatments have been assigned to the test animals or other units
entirely at random, the error variance for the assay may be computed
by Equation (13) from the variation within treatments. Alternatively,
the variation of treatment means about the fitted parallel lines may be
included in the error, by calculating

$$s^2 = \left\{\sum y^2 - \sum (T'^2/kf) - B^2\right\}/n \qquad (20)$$

For each preparation, $T' = \sum T_t$, summed over $k$ dosage levels; the
degrees of freedom $n = N - h$. Given $B^2$ and $s^2$, the slope factor C is
computed with Equation (14).

The number of responses $f$ at each dose may be constant or may
vary from preparation to preparation. If $f$ is constant, M' and L are
computed with Equation (16) modified to

$$M' = c_1 h' T_a \Big/ 2\sum T_b \qquad (21)$$

and with Equation (18) adjusted to

$$L = 2\sqrt{(C - 1)(CM'^2 + c'^2 h'/2)} \qquad (22)$$

where $h' = h + 1$ is the number of preparations. In an assay with one
Unknown, $h' = 2$, $\sum T_b = T_b$ and Equations (21) and (22) reduce to
their original form. If $f$ is not the same for all preparations, M' may
be computed by Equation (6) with $b$ from Equation (19). L is calculated
by Equations (9) and (15) with $\sum [x^2] = i'^2 e_b \sum f/2$.

Most microbial assays include several Unknowns, as in the randomized
vitamin B12 assay in Table 4. The reading for each tube has been
coded to obtain the individual responses y.


TABLE 4
A TURBIDIMETRIC ASSAY IN THREE RANDOMIZED SETS OF THE VITAMIN B12 ACTIVITY
OF THREE UNKNOWNS (U, U' AND U'') AT FOUR DOSAGE LEVELS, IN ML. OF
TEST SOLUTION PER TUBE
Data from the Food Research Laboratories in a U.S.P. collaborative study,
December, 1954.


A: Individual responses, y = 100 log (100 - % transmittance) - 100

        S for dose         U for dose         U' for dose        U'' for dose      Total
Set     1.5   2   3   4    1.5   2   3   4    1.5   2   3   4    1.5   2   3   4    T_s

I        43  48  54  62     45  45  54  62     41  46  56  59     41  46  51  63    816
II       38  49  56  62     43  46  56  63     40  45  53  61     40  48  52  60    812
III      38  45  54  61     41  46  54  60     34  45  51  54     40  48  52  60    783

T_t     119 142 164 185    129 137 164 185    115 136 160 174    121 142 155 183   2411


B: Calculation of M' and ½L

          Treatment total T_t for dose    Total                         M'            ½L
Prepn.    1.5      2      3      4         T'     T_b    T_q    T_a   (Eq. (21))    (Eq. (22))

S         119    142    164    185         610    2178    -2
U         129    137    164    185         615    1948    13      5     .00895        .0443
U'        115    136    160    174         585    1999    -7    -25    -.04477        .0445
U''       121    142    155    183         601    1954     7     -9    -.01612        .0443

Sum       484    557    643    727        2411    8079    11           M' = .0017906 T_a

x_1       -29    -12     12     29      c_1 = 7.2332, h' = 4, c'²h'/2 = .21246
x_2         1     -1     -1      1      s² = 5.2189, n = 11, t² = 4.844, C = 1.00924


If the assay had not been randomized, as is often the case, the error
variance would be based upon the variation of the $T_t$'s about parallel,
straight, log-dose response curves and this procedure has been followed
here. In consequence, $s^2$ has been computed with $\sum y^2$ in Equation (20)
replaced by $\sum T_t^2/f$ and with $n = hk - 1 = 11$ degrees of freedom. The
coefficients $x_1$ and $x_2$ were determined by the unequal spacing of the
log-doses. The values of $T_q = \sum (x_2 T_t)$ indicated no consistent simple
or quadratic curvature over the four preparations. The log-relative
potency M' and its half-confidence interval (½L) have been computed for
each Unknown with Equations (21) and (22).
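
The entries in part B of Table 4 can be recomputed from the treatment totals. The sketch below is illustrative only. It takes the sum of x_1² over a single preparation (1970) as the divisor for B², assumes that Equation (14) has the form C = B²/(B² - s²t²), and uses the printed constants c_1, c'² and t² as given; these choices were made because they reproduce the printed values, not because the equations themselves are quoted here.

```python
import math

# Treatment totals T_t from Table 4 (f = 3 sets), doses 1.5, 2, 3 and 4 ml.
totals = {
    "S": [119, 142, 164, 185],
    "U": [129, 137, 164, 185],
    "U'": [115, 136, 160, 174],
    "U''": [121, 142, 155, 183],
}
x1 = [-29, -12, 12, 29]
f, k, n = 3, 4, 11                          # n as stated in the text
c1, c2_prime, t2 = 7.2332, 0.10623, 4.844   # constants printed with Table 4
h_prime = len(totals)                       # number of preparations

# Slope contrast for each preparation and its sum over preparations.
T_b = {p: sum(c * t for c, t in zip(x1, ts)) for p, ts in totals.items()}
sum_T_b = sum(T_b.values())

# Sum of squares for the common slope (sum of x1^2 over one preparation).
B2 = sum_T_b ** 2 / (sum(c * c for c in x1) * h_prime * f)

# Eq. (20) with sum(y^2) replaced by sum(T_t^2)/f, as described in the text.
ss_treatments = sum(t * t for ts in totals.values() for t in ts) / f
ss_preparations = sum(sum(ts) ** 2 for ts in totals.values()) / (k * f)
s2 = (ss_treatments - ss_preparations - B2) / n

C = B2 / (B2 - s2 * t2)                     # assumed form of Eq. (14)


def potency(prep):
    """Return (M', half-interval) for one Unknown by Eqs. (21) and (22)."""
    T_a = sum(totals[prep]) - sum(totals["S"])
    M_prime = c1 * h_prime * T_a / (2 * sum_T_b)
    half_L = math.sqrt((C - 1) * (C * M_prime ** 2 + c2_prime * h_prime / 2))
    return M_prime, half_L


for prep in ("U", "U'", "U''"):
    print(prep, [round(v, 5) for v in potency(prep)])
```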


Assays in balanced pairs 


When responses can be arranged in homogeneous pairs, the most
efficient assay is a two-dose factorial twin cross-over design (Smith
et al. [1944]). Half of the pairs are given doses $S_1$ and $U_2$ and the other
half $S_2$ and $U_1$. Since a time or order sequence is usually involved, as in
the insulin assay, the order of injection is reversed in half of the animals
receiving each pair of doses, so that all four treatments are given on
each test day. Four equal groups are injected in the following order:


Item                                      Group or pair
                                 1           2           3           4

1st dose                        S_2         S_1         U_2         U_1
2nd dose                        U_1         U_2         S_1         S_2
Difference for each y        S_2 - U_1   U_2 - S_1   U_2 - S_1   S_2 - U_1
Response variate, y             y_1         y_2         y_3         y_4
Total of f responses, T_t       T_1         T_2         T_3         T_4

Factorial coefficients for
  S vs. U, x_a                  -1           1           1          -1
  Slope, x_b                     1           1           1           1
  Days                           1          -1           1          -1
  Residual                      -1          -1           1           1


The unit of response y in a twin cross-over assay is the difference
between the two paired reactions, within each pair subtracting the
reaction to the smaller dose from the reaction to the larger dose. Any
losses during the assay are adjusted before totalling the y's in each
group to obtain $T_t = T_1$ to $T_4$. From the definitions in Table 1 for
a 2,2 design, $T_a = -T_1 + T_2 + T_3 - T_4$ and $T_b = T_1 + T_2 + T_3 + T_4$;
from Equation (16) the log-relative potency $M' = iT_a/T_b$. The
contrasts for the remaining two degrees of freedom among the four
totals $T_1$ to $T_4$ measure directly the change in average sensitivity from
the first to the second dose, and indirectly the deviation from
parallelism (Finney [1956]).

The error variance $s^2$ for computing the confidence interval depends
upon the design of the assay. If each y represents a different animal,
as in an insulin assay, and all are assigned at random but in equal
numbers to each group and tested concurrently, the error variance of a
single y is computed with Equation (13), where $T_t = T_1$ to $T_4$. In
calculating the confidence interval, C is determined with Equation (17) and
$e_b = 4$, and L with Equation (18) and $c' = 1$.


PARALLEL-LINE ASSAYS WITH POTENTIAL COVARIANCE 


In most but not all parallel-line assays, the design insures that the 
numerator of M’ will be independent of its denominator. Three con- 
ditions will be considered which may give rise to covariance in parallel- 
line assays. In contrast, covariance between numerator and denomina- 
tor is always present in slope-ratio assays. 


Assays in randomized sets 


Not infrequently, the responding units in an assay can be arranged 
in advance of treatment into relatively homogeneous groups or sets, 
such as of litter mates in a vitamin D assay (Bliss and Gyorgy [1951]) 
or repeated doses of drug in the same patient in a clinical assay (Bliss, 
[1952a]). Successive smooth muscle contractions in a histamine assay 
(Schild [1942]) have been treated similarly, although these involve no 
inherent discontinuity which coincides with the subdivision into sets 
and alternative adjustments for changes in sensitivity may be more 
effective (Finney [1956]). Each set contains as many units as there are 
treatments and within each set one unit is allotted at random to each 
treatment. In effect, the Unknown is assayed separately in each set, 
and a separate M’ could be computed for each if required. 

The error variance $s^2$ is usually determined from the interaction of
the $k$ treatments by the $f$ sets as

$$s^2 = \left(\sum y^2 - \sum T_t^2/f - \sum T_s^2/k + T^2/kf\right)/n \qquad (23)$$

where $T_t$ is the treatment total, $T_s$ the set total, $T = \sum T_t = \sum T_s$
and $n = (k - 1)(f - 1)$. Finney [1952c] has noted, however, that this
interaction could contain variance components which increase the
apparent error of an assayed potency but not its real error.
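
The interaction variance of Equation (23) can be illustrated with the individual responses of Table 4. The sketch below is a plain-Python illustration with hypothetical function names; it treats the 16 dose-by-preparation combinations as the k treatments and the three randomized sets as the f sets.

```python
# Individual responses y from Table 4: one row per set (I, II, III); the 16
# columns are the four doses of S, U, U' and U'' in that order.
responses = [
    [43, 48, 54, 62, 45, 45, 54, 62, 41, 46, 56, 59, 41, 46, 51, 63],
    [38, 49, 56, 62, 43, 46, 56, 63, 40, 45, 53, 61, 40, 48, 52, 60],
    [38, 45, 54, 61, 41, 46, 54, 60, 34, 45, 51, 54, 40, 48, 52, 60],
]


def interaction_variance(y):
    """Return (s^2, n) for the treatments-by-sets interaction."""
    f = len(y)                 # number of sets
    k = len(y[0])              # number of treatments
    sum_y2 = sum(v * v for row in y for v in row)
    set_totals = [sum(row) for row in y]
    treatment_totals = [sum(col) for col in zip(*y)]
    grand_total = sum(set_totals)
    n = (k - 1) * (f - 1)
    ss = (sum_y2
          - sum(t * t for t in treatment_totals) / f
          - sum(t * t for t in set_totals) / k
          + grand_total ** 2 / (k * f))
    return ss / n, n


s2, n = interaction_variance(responses)
print(round(s2, 4), n)
```

For these data the interaction sum of squares is 82.125 on 30 degrees of freedom.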

His reasoning may be summarized geometrically. In an analytical 
assay, the potency of the test solution of an Unknown relative to that
of the Standard is presumably the same in all sets, so that the expected
value of M’ is constant. If two parallel log-dose response lines are 
fitted to the data of each set, the expected distance between them on 
the x axis (M’) is also constant. However, sets that are unequally 
sensitive to the drug may differ significantly in the slopes of these lines. 
If the horizontal distance M’ is to remain constant, a steeper slope (b) 
necessarily increases the vertical distance (a), leading to a positive 
correlation between the numerator and denominator of M’ = a/b. 
A significant variance component for slope would enlarge the inter- 
action of treatments by sets in Equation (23) but would not affect the 
precision of the assay. 

The presence of this effect is easy to test in a factorial assay. The
responses y in each randomized set are multiplied by the factorial
coefficients for the numerator ($x_a$) of M' and by those for its denominator
($x_b$). The sums of these products are the paired values $y_a = \sum (x_a y)$
and $y_b = \sum (x_b y)$, respectively, from which $\sum y_a = T_a$ and
$\sum y_b = T_b$ in Equation (16). From the $f$ values of $y_a$ and $y_b$, the
variances and covariance are

$$V(y_a) = \left(\sum y_a^2 - T_a^2/f\right)/ne_a$$
$$\mathrm{Cov}(y_a y_b) = \left(\sum y_a y_b - T_a T_b/f\right)/n\sqrt{e_a e_b} \qquad (24)$$
$$V(y_b) = \left(\sum y_b^2 - T_b^2/f\right)/ne_b$$

where $n = f - 1$, $e_a = \sum x_a^2$ and $e_b = \sum x_b^2$. The two variances,
$V(y_a)$ and $V(y_b)$, have different expectations but both contain the
postulated variance component for slope in addition to the true random
sampling error, as does the covariance Cov($y_a y_b$).

The Cov($y_a y_b$) can be tested for significance in terms of the
correlation coefficient $r = \mathrm{Cov}(y_a y_b)/\sqrt{V(y_a)V(y_b)}$. If
significant, the confidence limits $X_M$ may still be determined with
Equations (18) or (22) by computing C with a corrected error variance
$\tilde{s}^2$. To estimate $\tilde{s}^2$ the variances $V(y_a)$ and $V(y_b)$ in
Equation (24) are removed from the usual error variance $s^2$ in Equation
(23), giving

$$\tilde{s}^2 = \{(k - 1)s^2 - V(y_a) - V(y_b)\}/(k - 3) \qquad (25)$$

with $n = (f - 1)(k - 3)$ degrees of freedom. If the slope varies
significantly from group to group, $\tilde{s}^2$ should be smaller than $s^2$.
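
The test for covariance and the corrected error variance can be sketched as follows. The data are hypothetical (a 2,2 assay in four randomized sets, invented for illustration), and the corrected variance is written as {(k - 1)s² - V(y_a) - V(y_b)}/(k - 3), a form reconstructed from the verbal description of removing the two variances from the interaction; it should be read as an assumption rather than as the printed equation. For a 2,2 design the result must equal the variance of the remaining contrast ab, which gives an internal check.

```python
import math

# Hypothetical 2,2 assay in f = 4 randomized sets; columns are S1, S2, U1, U2.
y = [
    [30, 42, 33, 46],
    [28, 43, 30, 41],
    [35, 44, 36, 50],
    [31, 40, 34, 47],
]
x_a = [-1, -1, 1, 1]     # preparations (numerator of M')
x_b = [-1, 1, -1, 1]     # slope (denominator of M')
x_ab = [1, -1, -1, 1]    # parallelism, the remaining contrast


def set_contrasts(data, coeffs):
    """Contrast value within each set: sum(x * y) over the treatments."""
    return [sum(c * v for c, v in zip(coeffs, row)) for row in data]


def covariance(u, v, e_u, e_v):
    """Covariance form of Eq. (24); with u = v it gives the variance."""
    f = len(u)
    cross = sum(p * q for p, q in zip(u, v)) - sum(u) * sum(v) / f
    return cross / ((f - 1) * math.sqrt(e_u * e_v))


def interaction_variance(data):
    """Eq. (23): treatments-by-sets interaction mean square."""
    f, k = len(data), len(data[0])
    total = sum(sum(row) for row in data)
    ss = (sum(v * v for row in data for v in row)
          - sum(sum(col) ** 2 for col in zip(*data)) / f
          - sum(sum(row) ** 2 for row in data) / k
          + total ** 2 / (k * f))
    return ss / ((k - 1) * (f - 1))


e = 4                                    # sum of squared coefficients per row
y_a, y_b, y_ab = (set_contrasts(y, x) for x in (x_a, x_b, x_ab))
V_a = covariance(y_a, y_a, e, e)
V_b = covariance(y_b, y_b, e, e)
r = covariance(y_a, y_b, e, e) / math.sqrt(V_a * V_b)

k = len(y[0])
s2 = interaction_variance(y)
s2_corrected = ((k - 1) * s2 - V_a - V_b) / (k - 3)   # reconstructed Eq. (25)
print(round(r, 3), round(s2, 3), round(s2_corrected, 3))
```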

In some assays, the covariance in Equation (24) has proved significant
(Leech and Grundy [1953]); in others, non-significant and negligible
(Bliss [1952a,b]). One or both variances, $V(y_a)$ and $V(y_b)$, may be
smaller than $\tilde{s}^2$, which would favor computing C with the simpler error
variance in Equation (23). In an assay with several Unknowns, such
as in Table 4, the interaction of sets by preparations and of sets by
slope may be compared with the remaining interactions of sets by
treatments, most conveniently by an analysis of variance (Table 5).
If the mean squares for the two doubtful interactions were to exceed
that for the remainder, either significantly or with an F approaching
significance, $\tilde{s}^2$ would be more appropriate for computing C. The
log-dose response curves for the several preparations may be tested for
parallelism and for linearity as part of the analysis.


TABLE 5
ANALYSIS OF VARIANCE OF THE MICROBIAL ASSAY IN TABLE 4

Row   Source                          D.f.    Sum of squares    Mean square      F

 1    Among sets                        2          40.542          20.271
 2    Among treatments                (15)      (2861.812)
 3      (1) Preparations                3          43.396          14.465      (a) 3.65
 4      (2) Assay slope                 1        2761.008        2761.008
 5      (3) Preparations X slope        3           5.913           1.971      (b) 1.25
 6      (4) Non-linearity               8          51.495           6.437      (c) 2.29
 7    Sets X treatments               (30)        (82.125)         (2.7375)
 8      Sets X (1)                      6          23.792           3.965      (d) 1.60
 9      Sets X (2)                      2           3.950           1.975      (d)  .80
10      Sets X (3)                      6           9.448           1.575
11      Sets X (4)                     16          44.935           2.808
12      Sets X (3) and (4)             22          54.383           2.472


F values reported are computed with denominator mean squares from the following rows: (a),
row 8; (b), row 10; (c), row 11; (d), row 12.


In Table 5, neither the mean square for non-parallelism (row 5) 
nor that for curvature (row 6) exceeded significantly the corresponding 
mean square interaction with sets. When compared with the remaining 
interaction mean squares (row 12), those for sets with preparations 
(row 8) and with slope (row 9) did not approach significance. Hence, 
C could be determined with the overall assay error of $s^2 = 2.7375$ with
n = 30, from row 7 or computed by Equation (23). Half-confidence
intervals, recomputed for the three Unknowns with the new value of
C = 1.00415, ranged from ½L = 0.0297 to 0.0298.
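
The recomputed slope factor and half-intervals can be verified with a few lines of Python. This is a sketch under stated assumptions: C is taken as B²/(B² - s²t²), the half-interval as √{(C - 1)(CM'² + c'²h'/2)}, and t = 2.042 is the usual tabulated two-sided 5 percent value of Student's t for 30 degrees of freedom, which is not printed in the text.

```python
import math

B2 = 2761.008        # assay-slope sum of squares, row 4 of Table 5
s2 = 2.7375          # sets x treatments mean square, row 7 of Table 5
t = 2.042            # Student's t for 30 d.f., P = .05 (standard tables)
c2h = 0.21246        # c'^2 h'/2 as printed with Table 4

C = B2 / (B2 - s2 * t ** 2)          # assumed form of the slope factor


def half_interval(M_prime):
    """Half-confidence interval for one Unknown, assumed form of Eq. (22)."""
    return math.sqrt((C - 1) * (C * M_prime ** 2 + c2h))


# Log-relative potencies M' of U, U' and U'' from Table 4.
half_intervals = [half_interval(m) for m in (0.00895, -0.04477, -0.01612)]
print(round(C, 5), [round(h, 4) for h in half_intervals])
```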


Losses in balanced assays 


A balanced factorial assay has an equal number of responses at all 
dosage levels of each preparation. A loss during the assay must be 
repaired before the log-potency or its confidence interval can be
computed with the equations for a balanced design. If f animals have been
assigned to each treatment group at random, a missing value may be
replaced by the mean of the (f - 1) responses in the incomplete group
and its treatment total $T_t$ recomputed. If the assay consists of
randomized groups or Latin squares, the missing value is replaced by one
computed by the appropriate formula (Bliss [1952b]; Finney [1952b]).
With these replacements, Equation (16) gives an unbiassed estimate
(M') of the log-relative potency, but its confidence interval needs
further adjustment.

The principal corrections are in the slope factor C. For every
replacement, the assay variance $s^2$ loses one degree of freedom, which,
in turn, increases $t^2$. Less obviously, a missing value increases the
variance of $T_b$, if the treatment total $T_t$ containing the replacement
enters into its calculation. In effect, each treatment total is the
treatment mean multiplied by f, i.e., $T_t = f(T_t/f)$. If $T_t$ contains a
replacement, it is equal instead to $f(T_t'/f')$, where $T_t'$ and $f'$ are the
initial total and frequency. As a consequence, the variance of $T_b$ is
$f^2 \sum (x_b^2/f')s^2$, and C is increased. To avoid bias, it should be
computed as

$$C = T_b^2 \Big/ \left\{T_b^2 - f^2 \sum (x_b^2/f')\, s^2 t^2\right\} \qquad (26)$$



The larger the f, the smaller is the bias which this corrects. In a
two-dose assay with a single Unknown and one replacement, for example,
omitting the adjustment in Equation (26) would underestimate the last
term in the denominator of C by the following percentages:


f in complete groups       3      4      5      6      7      8      9
Percent underestimate    11.1    7.7    5.9    4.8    4.0    3.4    3.0
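
The percentages in this small table follow directly from comparing the multiplier of s²t² for a complete assay, e_b f, with the adjusted multiplier f² Σ(x_b²/f'). A minimal sketch, with illustrative function names and slope coefficients of ±1 assumed for the two-dose assay:

```python
def adjusted_multiplier(initial_freqs, f, x_b):
    """f^2 * sum(x_b^2 / f'): the multiplier of s^2 t^2 after replacement."""
    return f ** 2 * sum(x * x / fp for x, fp in zip(x_b, initial_freqs))


def percent_underestimate(f):
    """Percent by which e_b * f falls short when one group has f - 1 values."""
    x_b = [-1, 1, -1, 1]                        # slope coefficients, 2,2 assay
    complete = f * sum(x * x for x in x_b)      # e_b * f for a complete assay
    adjusted = adjusted_multiplier([f - 1, f, f, f], f, x_b)
    return 100 * (adjusted - complete) / adjusted


table = [round(percent_underestimate(f), 1) for f in range(3, 10)]
print(table)
```

Algebraically the shortfall reduces to 100/(4f - 3) percent, which is why it falls off quickly as f increases.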


A missing value has less effect upon the variance of $T_a$, which then
becomes $f^2 \sum (x_a^2/f')s^2$. In a two-dose assay, each $x_a^2 = x_b^2$, and the
ratio $V_{aa}/V_{bb}$ is identical with that for a complete assay. In all other
factorial designs, $c'^2$ in Equation 18, or $c'^2h'/2$ in Equation 22, may be
replaced by $c^2 \sum (x_a^2/f')/\sum (x_b^2/f')$. Here the bias may be either
positive or negative but is usually neglected when f > 5. If a single
value is lost from a three-dose assay of a single Unknown with f = 5,
for example, neglecting the correction overestimates L at most by 1.4
percent, if the loss is from an end dose, or underestimates it by 2.0
percent, if the loss is from a mid-dose.

In a factorial assay with a replacement, the numerator and
denominator of M' may no longer be independent. The covariance of $T_a$
and $T_b$ is

$$\mathrm{Cov}(T_a T_b) = f^2 \sum (x_a x_b/f')\, s^2$$

from which K in Equation (8) is computed as

$$K = (C - 1) \sum (x_a x_b/f') \Big/ \sum (x_b^2/f') \qquad (27)$$

The losses may be so balanced that $\sum (x_a x_b/f') = 0$ as in a complete
assay, so that the simpler Equation (9) applies. Even when not zero,
$\sum (x_a x_b/f')$ is usually negligible.

These corrections for lost readings have been examined for their 
effect upon a factorial assay of insulin with 6 y’s in groups 2 and 3, 
and 5 y’s in groups 1 and 4. The missing value in each of the latter 
two groups was replaced with the mean of the remaining observations 
before computing the log-relative potency M’ = —0.03522. The error 
variance with 18 d.f. was $s^2 = 214.08$. After correcting for replacements
by Equation (26), C = 1.03767, instead of C = 1.03413 by Equation
(17) with the same $s^2t^2$. The confidence interval computed with the
corrected C but without covariance was L = 0.1177. With K = 
— 0.003424 by Equation (27), the confidence interval corrected for 
covariance was L = 0.1135, a difference of less than 4 percent. 
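
The two corrections in this insulin example can be reproduced from the quoted C alone. The sketch below assumes that the assay is of the twin cross-over type, with coefficients x_a = -1, 1, 1, -1 and x_b = 1, 1, 1, 1 for groups 1 to 4, and that Equation (17) has the form C = T_b²/(T_b² - e_b f s² t²); both are assumptions made for illustration.

```python
f = 6                              # observations per complete group
initial_freqs = [5, 6, 6, 5]       # y's actually observed in groups 1 to 4
x_a = [-1, 1, 1, -1]               # assumed twin cross-over coefficients
x_b = [1, 1, 1, 1]
C_uncorrected = 1.03413            # quoted value from Eq. (17)

e_b = sum(x * x for x in x_b)
# From the assumed Eq. (17): s^2 t^2 / T_b^2 = (1 - 1/C) / (e_b f).
s2t2_over_Tb2 = (1 - 1 / C_uncorrected) / (e_b * f)

w_bb = sum(x * x / fp for x, fp in zip(x_b, initial_freqs))
w_ab = sum(a * b / fp for a, b, fp in zip(x_a, x_b, initial_freqs))

C_corrected = 1 / (1 - f ** 2 * w_bb * s2t2_over_Tb2)    # Eq. (26)
K = (C_corrected - 1) * w_ab / w_bb                      # Eq. (27)
print(round(C_corrected, 5), round(K, 6))
```

With these coefficients the corrected C and K agree with the values quoted in the text to the precision printed.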


The moving average adjustment for changing sensitivity* 


Some assays depend upon successive reactions of a single animal or
piece of tissue, such as the fall in blood pressure of an anaesthetized
chicken following a dose of posterior pituitary extract. The reaction
depends not only upon the dose of drug but also upon the condition of
the animal preparation, which may change slowly or rapidly. The
four doses in a 2, 2 factorial assay may be administered in randomized
sets (Schild [1942]), or by pairing the treatments, as in the U.S.P. XV
assay for vasopressin. Recently, Finney [1956] has proposed still
another approach.

A rather different design has been adopted for the posterior pituitary
assay in U.S.P. XV. A single dosage level of the Standard (S) is
injected at regular intervals, alternating with injections in a random order
of a higher or a lower dosage level of the Unknown ($U_1$ and $U_2$). An
assay consists of f = 3 or more sets or pairs of $U_1$ and $U_2$ and the
alternating one-level doses of Standard. If the sensitivity of the
animal should change markedly, all doses following one of the Standard
may be increased (or decreased) by a constant proportion, starting
again with the Standard at its new level.

For analysis, the successive reactions to the Standard are averaged
in pairs to form a series of moving averages. Each average is subtracted
from the intervening reaction to the Unknown to obtain a unit response,
designated as $y_1$ for a low dose of Unknown and as $y_2$ for a high dose.
Corresponding values of $x$ are obtained similarly from the log-doses;
they average zero ($\bar{x} = 0$) if S is exactly midway between $U_1$ and $U_2$
on a logarithmic scale. In computing M', $\bar{y}_U - \bar{y}_S = (T_1 + T_2)/2f$
and $b = (T_2 - T_1)/if$, from which

$$M' = \{i(T_1 + T_2)/2(T_2 - T_1)\} - \bar{x} \qquad (28)$$

where $T_1 = \sum y_1$, $T_2 = \sum y_2$, and $i$ is the constant log-interval
between $U_1$ and $U_2$.


*Referee comments have aided materially the revision of this section.

The error variance of an assay is computed from the y (the $y_1$
and $y_2$) as

$$s^2 = \left\{\sum y^2 - (T_1^2 + T_2^2)/f\right\}/n \qquad (29)$$

where n = 2(f - 1). Each y, however, is the difference between the
reaction to a dose of Unknown and the mean of two adjacent reactions
to the constant Standard. Assuming that the initial reactions are
independent and have a common variance $s_i^2$, $s^2 = (1 + \tfrac{1}{2})s_i^2$ or
$s_i^2 = 2s^2/3$, if we neglect the covariance between the y introduced by
the moving average for the Standard. This covariance depends upon the
number of times (r) the order of $y_1$ and $y_2$ is reversed in going from one
set to the next. With no reversals (r = 0) it is zero and $s_i^2$ is unbiassed;
with f = 3 or 4 and r = 2, the covariance exerts its greatest effect, and
$s^2$ would be multiplied by 17/18 for an unbiassed estimate of $s_i^2$. Hence,
for all practical purposes, the estimates from Equation (29) have
negligible bias.

The confidence interval for M' depends upon the variance and
covariance of $T_1 + T_2$ and $T_2 - T_1$. These can be expressed as a product
of $s_i^2$ and a linear combination of the squares or products of the
coefficients 1 or ½ with which the original reactions are combined into
$T_1 + T_2$ and $T_2 - T_1$. For $T_1 + T_2$ or the numerator, these coefficients
$a_a$ total zero and $\sum a_a^2 = 4f - \tfrac{1}{2}g$, where g is equal to 1 plus the
number of overall changes in dosage level during the assay. For $T_2 - T_1$
or the denominator, the coefficients $a_b$ again total zero and
$\sum a_b^2 = 2f + r + \tfrac{1}{2}g$, where r is the number of reversals between
overall changes in dosage. For their covariance, $\sum (a_a a_b)$ depends upon
r and g; in the dosage patterns most likely to occur in practice,
$\sum (a_a a_b)$ varies from -½ to ½.

The confidence interval may be computed from Equation (8) with

$$C = (T_2 - T_1)^2 \Big/ \left\{(T_2 - T_1)^2 - (4f + 2r + g)s^2t^2/3\right\}$$
$$V_{aa}/V_{bb} = i^2(8f - g)/4(4f + 2r + g)$$

and

$$K = (C - 1) \sum (a_a a_b)/2(4f + 2r + g)$$

K is usually so small that it may be omitted and the confidence interval
L calculated with Equation (9).

To avoid computing a separate adjustment for each individual
sequence, a second-order correction, a single pattern has been
postulated for the routine assays in U.S.P. XV. On the assumption of one
reversal in the order of $y_1$ and $y_2$ (r = 1), one change in the overall
dosage level (g = 2), and negligible covariance, C reduces to

$$C = (T_2 - T_1)^2 \Big/ \left\{(T_2 - T_1)^2 - 4(f + 1)s^2t^2/3\right\} \qquad (30)$$

and the ratio of the two variances in Equation (18) to

$$c'^2i^2 = (4f - 1)i^2/8(f + 1) \qquad (31)$$


These equations have been checked empirically by experimental
sampling. With the averages from six assays in two laboratories as a
guide, a hypothetical assay of posterior pituitary injection was
constructed. It was assigned an assumed slope of b = 37.5, a log-interval
between $U_1$ and $U_2$ of i = log (1.5²) = 0.352, and a standard deviation
for the initial reactions of $s_i$ = 1.5. Each mock assay consisted of
three sets of doses (f = 3), had a true log-relative potency of μ = 0.1,
and required no overall change in dosage level. The reactions, 1 to 13,
were assumed to decrease uniformly by steps of 1 millimeter from 34
to 22. As reactions to the Standard, the odd-numbered values were
unchanged; half of the even-numbered values, minus 2.85, were
designated as reactions to $U_1$, and the remaining even-numbered values,
plus 10.35, as reactions to $U_2$.

Three dosage orders were constructed, one with no reversal in the
order of $U_1$ and $U_2$, a second with one reversal ($U_1U_2U_1U_2U_2U_1$), and
a third with two reversals ($U_1U_2U_2U_1U_1U_2$). The random element
was then added from a table of random deviates for a normal population
with σ = 1 and mean = 0 (Dixon and Massey [1951]), multiplying
each random deviate, either + or -, by 1.5 before adding it to the
expected first to thirteenth reaction. This process was repeated 30
times for each of the three designs. Finally, each was analyzed as if
it were an actual posterior pituitary assay, leading to an "assayed"
potency M' and error variance $s^2$ by Equations (28) and (29), and to
an approximate confidence interval based upon the "standardized"
pattern by Equations (30), (31) and (18).
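
The construction of one mock assay can be sketched in Python as follows. This is an illustrative reading of the description above, not the original sampling procedure: the three dosage orders are assumptions chosen to give 0, 1 and 2 reversals, a pseudo-random generator stands in for the printed table of random deviates, Equation (28) is applied with x̄ = 0, and all names are hypothetical.

```python
import math
import random

# Assumed dosage orders (1 = U1, 2 = U2), keyed by the number of reversals.
ORDERS = {0: [1, 2, 1, 2, 1, 2], 1: [1, 2, 1, 2, 2, 1], 2: [1, 2, 2, 1, 1, 2]}


def mock_assay(order, noise_sd=0.0, rng=None):
    """Return (M', s^2) for one mock assay analysed by moving averages."""
    i = math.log10(1.5 ** 2)                 # log-interval between U1 and U2
    f = len(order) // 2                      # number of sets
    reactions = []
    for pos in range(2 * len(order) + 1):    # reactions 1 to 13, 34 down to 22
        value = 34.0 - pos
        if pos % 2 == 1:                     # even-numbered reaction: Unknown
            value += -2.85 if order[pos // 2] == 1 else 10.35
        if rng is not None:
            value += noise_sd * rng.gauss(0.0, 1.0)
        reactions.append(value)

    # Unit response: Unknown minus the mean of the two adjacent Standards.
    y = {1: [], 2: []}
    for j, dose in enumerate(order):
        pos = 2 * j + 1
        moving_average = (reactions[pos - 1] + reactions[pos + 1]) / 2
        y[dose].append(reactions[pos] - moving_average)

    T1, T2 = sum(y[1]), sum(y[2])
    M_prime = i * (T1 + T2) / (2 * (T2 - T1))          # Eq. (28), xbar = 0
    n = 2 * (f - 1)
    sum_y2 = sum(v * v for values in y.values() for v in values)
    s2 = (sum_y2 - (T1 ** 2 + T2 ** 2) / f) / n        # Eq. (29)
    return M_prime, s2


noise_free = {r: mock_assay(order) for r, order in ORDERS.items()}
noisy = mock_assay(ORDERS[1], noise_sd=1.5, rng=random.Random(1))
print(noise_free[0], noisy)
```

Without the random element every order returns the assumed potency of 0.1 with zero error variance, because the moving average removes the linear drift in sensitivity exactly.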


The variability resulting from this model, with only one source of
random variation and no residual effect of preceding doses, is summarized

512 BIOMETRICS, DECEMBER 1956 


TABLE 6
RESULTS OF 30 MOCK ASSAYS FOR EACH OF THREE DOSAGE SEQUENCES OF A
POSTERIOR PITUITARY INJECTION (THE "UNKNOWN") HAVING A "TRUE"
LOG-POTENCY OF μ = 0.10


Number of     "Assayed" M'              Average variance       Interval L from        Frequency
reversals,                                                                            of μ outside
r             Mean     Range            Observed   Expected    Within     Between     X_M
                                                               assays     assays

0             .1024    .0484, .1578     3.395      3.375       .1574      .1514       2
1             .0992    .0515, .1423     2.893      3.281       .1420      .1258       0
2             .1019    .0577, .1765     3.063      3.188       .1522      .1841       2
Combined      .1012    .0484, .1765     3.117      3.281       .1507      .1556       4


for each design in Table 6. The mean of the "observed" M' closely
approximated the true value of μ = 0.1 in each series of 30, but its
range was considerable. The "observed" error variances differed but
little from $3s_i^2/2 = 3.375$, and even less from expectations that had been
corrected for the covariance introduced by the moving average (Table
6). The average confidence interval, $\sqrt{\sum L^2/30}$, based upon the
variation within the individual assays, agreed satisfactorily with that
computed from the variation of the 30 values of M' between assays,
$2ts_M$. The single pattern postulated in Equations (30) and (31) gave a
slightly smaller expected confidence interval of 0.1442, when solved
with the values adopted in setting up the mock assays. None of these
statistics differed significantly from their expectations. Four of the
90 "observed" confidence intervals did not bracket the true value of
μ = 0.1, in good agreement with the postulated number of 4.5. So
far as can be judged from these mock assays, log-potencies and confidence
intervals computed with the simplified equations should meet all
practical needs.
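
The expected confidence interval of 0.1442 can be recovered from the values adopted for the mock assays. The sketch below assumes that Equation (18) has the form L = 2√{(C - 1)(CM'² + c'²i²)}, takes s² at its nominal expectation of 3s_i²/2, and uses t = 2.776, the usual tabulated two-sided 5 percent value of Student's t for n = 2(f - 1) = 4 degrees of freedom; none of these three choices is stated explicitly in the text.

```python
import math

f = 3                       # sets per mock assay
b = 37.5                    # assumed slope
i = math.log10(1.5 ** 2)    # log-interval between U1 and U2
s_i = 1.5                   # standard deviation of the initial reactions
mu = 0.1                    # true log-relative potency
t = 2.776                   # Student's t, 4 d.f., P = .05 (standard tables)

s2 = 1.5 * s_i ** 2                     # nominal expectation of s^2, 3.375
slope_total = b * i * f                 # expected value of T2 - T1

# Eq. (30): standardized pattern with r = 1 and g = 2.
C = slope_total ** 2 / (slope_total ** 2 - 4 * (f + 1) * s2 * t ** 2 / 3)

# Eq. (31): ratio of the two variances.
c2i2 = (4 * f - 1) * i ** 2 / (8 * (f + 1))

# Assumed form of Eq. (18).
L = 2 * math.sqrt((C - 1) * (C * mu ** 2 + c2i2))
print(round(C, 4), round(c2i2, 5), round(L, 4))
```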


THE CONFIDENCE INTERVAL FOR ALL-OR-NONE ASSAYS 


In an “all-or-none”’ reaction, each individual reacts or fails to react; 
the extent of its reaction is not measured. Whether the animal reacts 
or not depends upon the relation between its threshold and the dose at 
the moment of testing. If the dose is larger than its threshold, the 
animal reacts; if less, it does not react. A group of animals (or plants) 
is tested at each of several dosages that give positive reactions between 
0 and 100 percent. When these percentages are plotted against the
log-dose, they usually define a symmetrical, sigmoid curve resembling
the cumulative normal, as if the logarithm of the individual threshold 
dose in the original population were normally distributed. 

To test this hypothesis, each percentage reaction may be trans- 
formed to the corresponding normal equivalent deviate (N.E.D.) or 
probit (N.E.D. + 5) and plotted against the log-dose. If the data are 
consistent with the above interpretation, the resulting curve will approxi- 
mate a straight line and the variance of each probit will be a function 
of its expected response. To take full advantage of the information 
supplied by each group, the log-dose response line may be computed by 
maximum likelihood with working probits and weights (Bliss [1952b]; 
Finney [1952a]). In experiments based upon this relation, the confidence 
interval for the log-relative potency and for other statistics can be 
computed with Equation (9) or its equivalent, as will be described 
presently. 

All-or-none data may be analyzed with two other transformations 
of the percentage response, the logit and the inverse sine. The logit is 
based upon the logistic curve, which closely approximates the normal 
from about 10 to 90 percent response, but departs increasingly at the 
ends. It requires weighting coefficients and presents computational 
problems similar to those found in probit analysis. The procedure and 
its logical basis have been developed by Berkson [1953] to whose papers 
the reader is referred. The second alternative, the inverse sine or angular 
transformation, is an equally good approximation to the normal over 
the same range as the logit (Claringbold et al [1953]; Knudsen and 
Curtis [1947]). It is justified by statistical convenience rather than by 
a biological model. If the groups are equal in size, the information in 
each angle is independent of its observed or expected value, and hence 
weighting coefficients are unnecessary. In consequence, an assay can 
be designed and analyzed factorially, with little change from a graded- 
response assay. If responses of 0 or 100 percent are at all frequent, or 
if interest centers in the ends of the response range, probit analysis 
would be preferred. 


Analysis in angles 

The angular transformation will be considered here in relation to 
factorial assays, although it can be extended readily to less well-designed 
experiments. A constant number of test animals (n’) is assigned to 
each of the k = 2 or more doses of the Standard and of the Unknown, 
spaced equally in log-units. Doses are selected that will produce 
similar reactions within the range from 10 to 90 percent, so far as this 
can be predicted in advance.

For analysis, each percentage reaction in the intermediate zone is
converted to its empirical equivalent angle y, which extends from
0° to 90° for percentages of 0 to 100 (Bliss and Calhoun [1954]; Finney
[1952b]; Fisher and Yates [1953]). A reaction of 0 (or 100) percent
distorts the curve less, however, if its empirical angle is replaced by a
working angle, as illustrated below. Each such estimate is a function of
its expected value, which depends, in turn, upon observations in the
intermediate zone. When these expectations are less than 0° or more than
90°, observed percentages of 0 and 100 contain no usable information
and are omitted. For a maximum likelihood solution, all empirical
angles would be replaced by working angles in successive iterations,
but in view of the empirical basis of the transformation this refinement
is here of doubtful value.

The subsequent calculation is very similar to that for a graded-
response factorial assay. Each angle y is multiplied by its corresponding
factorial coefficients $x_a$ or $x_b$, and the products summed to obtain
$T_a = \sum (x_a y)$ and $T_b = \sum (x_b y)$, from which the log-relative potency
M' is computed by Equation (16).

The error variance of y has the expected value $\sigma^2 = 820.7/n'$,
where n' is the number of individuals per group and is the same in
all groups. If n' were to vary between groups by less than 10 or 20
percent, its harmonic mean could be used to compute an acceptable
$\sigma^2$. Dividing a sum of squared deviations by $\sigma^2$ converts it to $\chi^2$ for
testing agreement with the assumptions underlying an assay. The
observed variation of the y's about the fitted parallel lines totals

$$[y^2] = \sum y^2 - T_a^2/e_a - T_b^2/e_b - \left(\sum y\right)^2 \Big/ \sum k \qquad (32)$$

with $n = \sum k - 3$ degrees of freedom, from which an approximate
$\chi^2 = [y^2]/\sigma^2$. If $\chi^2$ is not significant or near significant, the slope
factor C is computed by Equation (17), substituting $\sigma^2$ for $s^2$, and with
f = 1 and $t^2 = 3.841$. Given C and M', no other change is required in
computing L by Equation (18). However, if $\chi^2$ is significant or approaches
significance, the dosage-response curves for the Standard and the
Unknown are tested for divergence in slope and for single curvature.
$T_i = \sum x_i y$ is computed with the factorial coefficients in rows i = ab, q
and aq of Table 1 or 2, to obtain for each row $\chi^2 = T_i^2/e_i\sigma^2$ with one
degree of freedom. If these explain the apparent heterogeneity, so
that the residual $\chi^2 = [y^2]/\sigma^2 - \sum (T_i^2/e_i\sigma^2)$ is not significant, the
validity of the assay is in doubt. Otherwise, the heterogeneity may
reflect departures from random binomial variation that are not critical.
These might occur, for example, if the individual test animals were not
assigned to treatment groups entirely at random, although the treatment
groups were later assigned to doses at random. C may then be computed
by Equation (17) with the empirical error variance $s^2 = [y^2]/n$,
Student's t for $n = \sum k - 3$ degrees of freedom, and f = 1.

The calculation with angles may be illustrated by an assay of the
relative toxicity of two insecticides, nicotine and lindane, in the adult
milkweed bug, Oncopeltus fasciatus (Turner [1955]). Adult bugs of
the same age were anaesthetized by carbon dioxide, injected individually
with insecticide, and held for 48 hours before determining mortality.
From earlier experiments, the Unknown (lindane) was assumed to be
40 times as toxic as the Standard (nicotine), and on this basis the two
toxicants were injected at six corresponding dosage levels, varying in
concentration by a constant factor of $\sqrt{2}$. Twenty-four bugs were
injected at each dosage level of each toxicant in concomitant tests,
leading to the percentage kills in Table 7.


TABLE 7

Analysis in Angles of a Balanced All-or-None Assay of the Toxicity of
Lindane Relative to Nicotine When Injected into Adult Milkweed
Bugs in Groups of n' = 24 (Turner [1955])


Coded log-dose for        Percent kill           Kill in angles, y
5 doses    6 doses     Nicotine    Lindane     Nicotine    Lindane
   x'         x'           S           U           S           U
   -2         -5          8.3        16.7        16.7        24.1
   -1         -3         20.8        25.0        27.1        30.0
    0         -1         37.5        62.5        37.8        52.2
    1          1         62.5        79.2        52.2        62.9
    2          3         83.3        83.3        65.9        65.9
   (3)         5        100         100          83.1*       86.5*


*Maximum likelihood estimate (from Fisher and Yates [1953] Table XIV) with
expected angle $\phi$ computed as $\phi = \bar{y} + b'x'$, where
$b' = T_b/e_b = 12.0$, x' = 3, and $\bar{y}$ = 39.94 for S and 47.02 for U,
giving $\phi_S = 75.94$ and $\phi_U = 83.02$.

i = .1505, $\bar{x}_S - \bar{x}_U = 1.602$, $\sigma^2 = 820.7/24 = 34.196$,
$t^2 = 3.841$

5-dose assay: $T_a = 35.4$, $T_b = 240.0$, c = 4, M' = .08880 (Eq. (16)),
M = 1.6908 (Eq. (10)); $[y^2] = 104.24$ (Eq. (32)),
$\chi^2 = [y^2]/\sigma^2 = 3.05$, n = 7; C = 1.04779 (Eq. (17)),
$c'i^2 = .18120$, $X_{M'} = .09304 \pm .09515$ (Eq. (9)),
$X_M = 1.5999, 1.7902$

6-dose assay: $T_a = 38.8$, $T_b = 893.2$, c = 35/3, M' = .07627 (Eq. (16)),
M = 1.6783 (Eq. (10)); $[y^2] = 136.84$ (Eq. (32)), $\chi^2 = 4.00$, n = 9;
C = 1.02359 (Eq. (17)), $c'i^2 = .26425$,
$X_{M'} = .07807 \pm .07984$ (Eq. (9)), $X_M = 1.6002, 1.7599$


Since the largest dose of each toxicant killed 100 percent of the bugs,
relative toxicity was first computed from the five smaller doses. Applying
the five-dose factorial coefficients in Table 1 gave the sums of
products $T_a$ and $T_b$, leading to the log-potency M. From $\chi^2$, the
observations agreed very well with the fitted parallel lines (P = 0.88),
well within the requirements for assay validity and homogeneity.
Hence C was computed with the expected variance $\sigma^2 = 34.196$, leading
to the log-confidence interval given in the table.
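
The arithmetic behind the angle analysis can be retraced with a short Python sketch. It is not part of the original paper. Equation (32) is taken as printed; the forms used for M', C and the half-interval are the usual ratio (Fieller-type) expressions without covariance, assumed here because Equations (9), (16) and (17) lie outside this section, although they reproduce the tabulated figures. The function name `angle_assay` and the argument `step` (log-interval per unit of coded dose) are illustrative.

```python
import math

def angle_assay(x, y_s, y_u, step, n_prime, t2=3.841):
    """Balanced all-or-none assay in angles: Standard vs. one Unknown.

    x      : coded log-doses (must sum to zero), shared by both preparations
    y_s    : angles for the Standard, y_u : angles for the Unknown
    step   : log10 interval corresponding to one unit of x (assumed name)
    """
    k = len(x)
    e_a = 2 * k                               # sum of squared +1/-1 coefficients
    e_b = 2 * sum(xi * xi for xi in x)        # sum of squared slope coefficients
    T_a = sum(y_u) - sum(y_s)                 # contrast between preparations
    T_b = sum(xi * (a + b) for xi, a, b in zip(x, y_s, y_u))  # combined slope contrast
    slope = T_b / e_b
    m_prime = step * (T_a / k) / slope        # log-relative potency M'
    sigma2 = 820.7 / n_prime                  # expected variance of an angle
    total = sum(y_s) + sum(y_u)
    # Equation (32): variation about the fitted parallel lines
    ss = (sum(v * v for v in list(y_s) + list(y_u))
          - T_a ** 2 / e_a - T_b ** 2 / e_b - total ** 2 / (2 * k))
    chi2 = ss / sigma2
    C = slope ** 2 / (slope ** 2 - t2 * sigma2 / e_b)
    # variance ratio of the two contrasts, expressed in log units
    ratio = step ** 2 * 2 * e_b / k
    half = math.sqrt((C - 1) * (C * m_prime ** 2 + ratio))
    return {"T_a": T_a, "T_b": T_b, "M_prime": m_prime, "ss": ss,
            "chi2": chi2, "C": C, "centre": C * m_prime, "half": half}

# Five lower doses of Table 7
five = angle_assay([-2, -1, 0, 1, 2],
                   [16.7, 27.1, 37.8, 52.2, 65.9],
                   [24.1, 30.0, 52.2, 62.9, 65.9],
                   step=0.1505, n_prime=24)
print(five)
```

With the six-dose coding (x' = -5, ..., 5) one unit of x' is half a dose interval, so `step` would be 0.1505/2.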

The assay was then recomputed from all six dosage levels, with
provisional maximum likelihood estimates for the two kills of 100
percent. From the combined slope based upon the five lower doses for
both toxicants of b = 240.0/20 = 12.0, the expected response at the
highest dose (x' = 3) was $\phi_S = 39.94 + 12.0 \times 3 = 75.94$ for the
Standard and $\phi_U = 83.02$ for the Unknown. With these "expected
values" and a table of the maximal working angle (Bliss and Calhoun
[1952]; Finney [1952b]; Fisher and Yates [1953]), the required angles
were y = 83.1 for the Standard and y = 86.5 for the Unknown. When
recomputed with the six-dose factorial coefficients from Table 1, $\chi^2$
again indicated satisfactory agreement with hypothesis. Including the
100 percent kills changed the estimated log-relative potency by 14
percent and shortened the confidence interval by 16 percent. Whether
these new estimates could be considered an improvement over those
limited to the intermediate kills would be judged from their agreement
with probit analysis, as described in the next section.


Analysis in probits


Because they are based directly upon the model, probits are additive, 
a characteristic which takes precedence over equality of the variance. 
Dosages can be selected from the zone of most interest to the experi- 
menter, although the slope has its greatest precision when the response 
ranges from 7-10 to 90-93 percent. Each percentage reaction between 
0 and 100 percent is transformed initially to its empirical probit y’ 
and plotted against the log-dose x. The responses of 0 and 100 percent, 
which are $-\infty$ and $+\infty$ in empirical probits, may be indicated by small
vertical arrows. 

For determining the “expected” probit Y at each dose, provisional 
dosage-effect curves are usually drawn by inspection, which limits 
their precision. If the doses are spaced at equal log-intervals and coded, 
parallel unweighted curves can be computed easily and objectively, 
initially from the empirical probits in the intermediate zone. For each 
reaction of 0 or 100 percent, a temporary "expected" probit Y is
calculated from the initial equation, and with this Y a preliminary working
probit. The provisional curves are then recomputed, including these
new values but again without weighting, and solved for each coded
log-dose x' to obtain the expected probits Y required for an estimate
by maximum likelihood. Even with these precautions, the y’s are still 
provisional estimates, to be replaced in successive weighted approxima- 
tions which converge to the maximum likelihood solution. By providing 
a better starting point, however, they should reduce the number of 
iterations required for an acceptable precision. 

Given the expected probit Y for each group of n’ individuals, its 
working probit $y = (Y - P/Z) + p/Z$ and weight $w = n'(Z^2/PQ)$
are determined with the aid of suitable tables (Finney [1952a]; Finney 
and Stevens [1946]; Fisher and Yates [1953]). The combined slope of 
the best-fitting set of parallel lines is computed in terms of x’, y and w as 


    $b' = \sum [wx'y] / \sum [wx'^2]$    (33)


where $[wx'^2] = \sum (wx'^2) - (\sum wx')^2 / \sum w$ and
$[wx'y] = \sum (wx'y) - \sum (wx') \sum (wy) / \sum w$ for each curve. The
extent of any divergence in slope can be measured by $\chi_b^2$,
determined as


    $\chi_b^2 = \sum B_i^2 - B^2$    (34)


where $B_i^2 = [wx'y]^2 / [wx'^2]$ for each curve and
$B^2 = (\sum [wx'y])^2 / \sum [wx'^2]$.
The slope b', the difference in the weighted mean responses,
$\bar{y}_U - \bar{y}_S$, and the weighted mean coded log-doses
$\bar{x}'_S$ and $\bar{x}'_U$ are substituted in the equation for the
log-relative potency to obtain


    $M' = i \{ \bar{x}'_S - \bar{x}'_U + (\bar{y}_U - \bar{y}_S)/b' \}$    (35)

where i is the interval in logarithms corresponding to a unit difference
in x'. Unlike the balanced factorial assay with equally-weighted y's,
the difference $\bar{x}'_S - \bar{x}'_U$ is seldom exactly zero. M' is
converted to a log-potency M by adding log R (Equation (10)).

Agreement of the working probits y with the fitted parallel straight 
lines, h' in number, may be tested by the


    assay $\chi^2 = \sum [wy^2] - B^2$    (36)


where $[wy^2] = \sum (wy^2) - (\sum wy)^2 / \sum w$ for each line. As a
convenient empirical rule, the assay $\chi^2$ is assumed to have
n = k - h' - 1 degrees of freedom, where k is the number of groups with
both an expected response and an expected non-response of at least
one-half individual. If the assay $\chi^2$ is neither significant nor
approaches significance, we compute


    $C = B^2 / (B^2 - t^2)$    (37)


with $t^2 = 3.841$ for $n = \infty$ and P = .05. The assay $\chi^2$ may
reveal heterogeneity among the test groups although no consistent curvature
or non-parallelism invalidates the assay. C is then computed by
Equation (14), with $s^2 = (\text{assay } \chi^2)/n$, and t for n degrees of
freedom. In either case the confidence interval is computed by Equation (9)
with


    $V_{aa}/V_{bb} = \{ 1/\sum w_S + 1/\sum w_U \} \sum [wx^2]$    (38)


where $\sum [wx^2] = i^2 \sum [wx'^2]$ if the log-doses have been coded.

These equations are readily extended to other probit analyses. For 
a selected level of response Y, such as Y = 5 for 50 percent, the
corresponding log-dose $X_{50}$ may be computed from a single log-dose effect
curve as $X_{50} = \bar{x} + (Y - \bar{y})/b = \bar{x} + X'$. The slope factor
C is then determined from the data for one curve, leading to the confidence
interval


    $X_{X'} = CX' \pm \sqrt{(C - 1)(CX'^2 + [wx^2]/\sum w)}$    (39)


In another application, the chemotherapeutic index (C.I.) of a 
drug may be defined as the ratio of its LD5, the dose causing 5 percent 
damage or death, to its ED95, the dose producing a characteristic 
therapeutic response in 95 percent of individuals of the same species 
under similar conditions. This is a difference in response of D = 3.290 
probits. If the respective dosage-effect curves have the same slope b 
within the sampling error, we may compute the logarithm of the chemo- 
therapeutic index as 


    $\log (\mathrm{C.I.}) = \bar{x}_1 - \bar{x}_2 - (\bar{y}_1 - \bar{y}_2 + D)/b$    (40)


where the subscripts 1 and 2 refer to the toxic and therapeutic effects
respectively. The confidence interval for log (C.I.) is the same as that
for an all-or-none assay except that M' in Equation (9) is replaced by
log (C.I.).
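
A brief Python sketch, added here for illustration, shows that Equation (40) is simply the difference between the log LD5 read from the toxicity line and the log ED95 read from the therapeutic line when both share the slope b. The function name and the numbers in the usage lines are hypothetical, not data from the paper.

```python
def log_chemotherapeutic_index(xbar_tox, ybar_tox, xbar_ther, ybar_ther, b, D=3.290):
    """Equation (40): log(C.I.) = x1 - x2 - (y1 - y2 + D)/b.

    Subscript 1 (toxic) and 2 (therapeutic) are the weighted mean log-doses
    and mean probits of the two parallel dosage-effect lines; D is the
    distance in probits between 5 and 95 percent response.
    """
    return xbar_tox - xbar_ther - (ybar_tox - ybar_ther + D) / b

# Hypothetical means and slope, for illustration only
x1, y1, x2, y2, b = 1.80, 5.20, 0.60, 4.90, 6.0
log_ci = log_chemotherapeutic_index(x1, y1, x2, y2, b)

# Direct route: log LD5 minus log ED95 from the two fitted lines
log_ld5 = x1 + ((5 - 1.645) - y1) / b
log_ed95 = x2 + ((5 + 1.645) - y2) / b
print(log_ci, log_ld5 - log_ed95)
```

Both printed values agree, since D = 2 x 1.645 = 3.290 probits.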

The example in Table 7 has been recalculated in probits with coded 
log-doses x’ (Table 8). Each percentage dead in the intermediate zone 
was first converted by an appropriate table to its empirical probit y’. 
Two parallel lines were then fitted without weighting to the 5 paired 
values of x' and y' for each toxicant, with the common slope b = 0.5605
and the intercepts $\bar{y}'_S$ and $\bar{y}'_U$ at x' = 0. Temporary
expected probits of 6.44 for the Standard and 6.77 for the Unknown were
computed from these equations at x' = 3, leading to the provisional working
probits y' = 6.97 and 7.23 respectively, so that each x' was paired with
an explicit y'.

TABLE 8

Analysis in Probits of the All-or-None Assay in Table 7

          Coded  Dead,  Empirical  Expected  Weight,  Working    Products        Second weighted
Toxicant  dose     %    probit,    probit,     w      probit,                       estimate
           x'             y'         Y                  y        wx'      wy       w       y
Nicotine   -2     8.3    3.61       3.55       6.8     3.62     -13.6    24.616    6.8    3.62
S          -1    20.8    4.19       4.18      11.9     4.19     -11.9    49.861   11.8    4.19
            0    37.5    4.68       4.81      15.1     4.68       0      70.668   15.0    4.68
            1    62.5    5.32       5.44      14.2     5.32      14.2    75.544   14.4    5.32
            2    83.3    5.97       6.07      10.0     5.96      20.0    59.600   10.3    5.96
            3   100      6.97*      6.70       5.0     7.17      15.0    35.850    5.3    7.13
Total       3           30.74                 63.0               23.7   316.139   63.6
Lindane    -2    16.7    4.03       3.87       9.5     4.05     -19.0    38.475    9.5    4.05
U          -1    25.0    4.33       4.50      13.9     4.33     -13.9    60.187   14.0    4.33
            0    62.5    5.32       5.13      15.2     5.32       0      80.864   15.2    5.32
            1    79.2    5.81       5.76      12.3     5.81      12.3    71.463   12.4    5.81
            2    83.3    5.97       6.40       7.2     5.82      14.4    41.904    7.5    5.85
            3   100      7.23*      7.03       3.0     7.45       9.0    22.350    3.1    7.42
Total       3           32.69                 61.1                2.8   315.243   61.7

Unweighted, omitting x' = 3: $\sum x'y' = 11.21$, b = .5605,
$\bar{y}'_S = 4.754$, $\bar{y}'_U = 5.092$; at x' = 3, $Y_S = 6.44$ and
$Y_U = 6.77$, giving provisional estimates*
Unweighted, including x' = 3: $\sum x' = 3$, $[x'^2] = 17.5$,
$\sum [x'^2] = 35$, $\sum [x'y'] = 22.095$, b = .6313,
$Y_S = 4.808 + .631x'$, $Y_U = 5.133 + .631x'$

                        First weighted estimate          Second weighted estimate
Statistic               S         U         Both         S         U         Both
$\bar{x}'$            .37619    .04583    .33036       .40094    .05997    .34097
$\bar{y}$            5.01808   5.15946    .14138      5.03212   5.16823    .13611
$[wx'^2]$            129.384   119.872   249.256      132.076   122.078   254.154
$[wx'y]$              84.273    70.738   155.011       85.758    72.108   157.866
b'                                        .62189                            .62114
$B^2$                 54.890    41.743    96.401       55.683    42.592    98.057
$[wy^2]$              56.498    45.735                 57.175    46.378
$[wy^2] - B_i^2$       1.608     3.992  $\chi_b^2$ = .232   1.492     3.786  $\chi_b^2$ = .218
Assay $\chi^2$                   5.832, n = 9                     5.496, n = 9
M', M                          .08393, 1.6859                   .08430, 1.6863
C; $V_{aa}/V_{bb}$             1.04150, .18202                  1.04077, .18381
$\frac{1}{2}L$                      .08865                           .08829


Parallel, unweighted lines were recomputed from the six pairs of
x', y' for each toxicant in place of the customary graphic estimates,
and then solved for each x' to obtain the expected probit Y for each
treatment. Given the Y's and tables of the weighting coefficients
($Z^2/PQ$), the minimum probit ($Y - P/Z$) and range ($1/Z$), the weights
w and the working probits y followed readily. Two columns of products
completed the initial part of the computation.
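
The weights and working probits in Table 8 were read from printed tables, but they can be checked directly from the definitions $y = (Y - P/Z) + p/Z$ and $w = n'Z^2/PQ$. The Python sketch below is an illustration added here, not the author's procedure; it evaluates P and Z from the normal distribution with the standard library and assumes the convention that a probit is the normal deviate plus 5. The function names are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def working_probit(Y, p):
    """Working probit for expected probit Y and observed proportion p."""
    z = Y - 5.0                      # probit = normal deviate + 5
    P = _STD_NORMAL.cdf(z)           # expected proportion responding
    Z = _STD_NORMAL.pdf(z)           # ordinate of the normal curve at z
    return (Y - P / Z) + p / Z       # minimum probit plus p times the range 1/Z

def probit_weight(Y, n_prime):
    """Weight w = n' Z^2 / (P Q) for a group of n' individuals."""
    z = Y - 5.0
    P = _STD_NORMAL.cdf(z)
    Z = _STD_NORMAL.pdf(z)
    return n_prime * Z * Z / (P * (1.0 - P))

# Nicotine at x' = 3: expected probit 6.70, observed kill 100 percent
print(round(working_probit(6.70, 1.0), 2), round(probit_weight(6.70, 24), 1))
```

The same two functions applied to the other rows of Table 8 agree with the tabulated w and y to the last printed figure or within one unit of it.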

The calculation of the first weighted estimate is shown in the lower
part of Table 8, leading to M' = 0.08393 and L = 0.1773 by Equations
(33)-(38). With assay $\chi^2 = 5.832$ for 9 degrees of freedom, the assay
proved satisfactorily homogeneous. The approximation has been
carried a stage further by computing improved Y's from these weighted
curves, leading to new weights w and working probits y for a second
weighted estimate. Nine of the 12 y's in this second approximation
were unchanged and the remaining three differed so little that the
second weighted M' = .08430 agreed within one-half percent with the
first weighted estimate. The assay $\chi^2$ dropped to 5.496 and the
confidence interval L to 0.1766, all representing so little change as not to
warrant a third approximation. By starting with an unweighted,
computed estimate, the first weighted calculation can often be
considered a very fair maximum likelihood estimate, even though the
weights may vary as much as fivefold among themselves.
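
The first weighted estimate can be retraced from the weights and working probits of Table 8 with the following Python sketch, added as an illustration of Equations (33) to (38). The interval is formed as $L = 2\sqrt{(C-1)(CM'^2 + V_{aa}/V_{bb})}$, which is assumed here to be the form of Equation (9) because that equation lies outside this section; it reproduces the tabulated L. Function and variable names are illustrative.

```python
import math

def corrected_sums(x, y, w):
    """Weighted means and corrected sums of squares and products for one line."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x)) - swx ** 2 / sw
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y)) - swx * swy / sw
    syy = sum(wi * yi * yi for wi, yi in zip(w, y)) - swy ** 2 / sw
    return sw, swx / sw, swy / sw, sxx, sxy, syy

def probit_assay(x, w_s, y_s, w_u, y_u, step, t2=3.841):
    sw_s, xb_s, yb_s, sxx_s, sxy_s, syy_s = corrected_sums(x, y_s, w_s)
    sw_u, xb_u, yb_u, sxx_u, sxy_u, syy_u = corrected_sums(x, y_u, w_u)
    sxx, sxy = sxx_s + sxx_u, sxy_s + sxy_u
    b = sxy / sxx                                          # Equation (33)
    B2 = sxy ** 2 / sxx
    chi2_slope = sxy_s ** 2 / sxx_s + sxy_u ** 2 / sxx_u - B2   # Equation (34)
    m_prime = step * (xb_s - xb_u + (yb_u - yb_s) / b)     # Equation (35)
    chi2_assay = syy_s + syy_u - B2                        # Equation (36)
    C = B2 / (B2 - t2)                                     # Equation (37)
    ratio = step ** 2 * (1 / sw_s + 1 / sw_u) * sxx        # Equation (38), coded doses
    L = 2 * math.sqrt((C - 1) * (C * m_prime ** 2 + ratio))
    return {"b": b, "M_prime": m_prime, "chi2_slope": chi2_slope,
            "chi2_assay": chi2_assay, "C": C, "ratio": ratio, "L": L}

x = [-2, -1, 0, 1, 2, 3]
first = probit_assay(
    x,
    [6.8, 11.9, 15.1, 14.2, 10.0, 5.0], [3.62, 4.19, 4.68, 5.32, 5.96, 7.17],
    [9.5, 13.9, 15.2, 12.3, 7.2, 3.0], [4.05, 4.33, 5.32, 5.81, 5.82, 7.45],
    step=0.1505)
print(first)
```

Small differences in the last decimal place from Table 8 arise only from the rounding of intermediate totals in the printed table.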

The computed probit estimates were approximated more closely by 
the angular calculation when the latter was restricted to the interme- 
diate kills. This is consistent with limiting the calculation in angles 
to the zone where they most nearly parallel the probits. When the 
doses killing 100 percent were included, and the computation was con- 
tinued to a maximum likelihood estimate paralleling that for probits, 
successive values of M’ were 0.07627, 0.07260 and 0.07188. Since the 
angular transform is justified theoretically by its approach to the probit, 
a complete maximum likelihood solution which includes 0 and 100 
percent kills has little to recommend it. 


THE CONFIDENCE INTERVAL FOR THE RATIO OF TWO MEANS 


As noted before, when the standard deviation of the threshold dose 
is a small enough fraction of its mean, a normal and a log-normal 
distribution cannot be distinguished empirically. This is sufficiently 
true of the pigeon assay for digitalis that its potency is computed in 
U.S.P. XV directly from the ratio of the mean threshold doses for the 
Standard $\bar{v}_S$ and for the Unknown $\bar{v}_U$ as


    $P_u = R \bar{v}_S / \bar{v}_U$    (41)


where v is the threshold dose killing an individual pigeon and R is the
ratio of the number of ml. of the Standard preparation to the number of
ml. of the Unknown preparation, each in 100 ml. of their respective
dilutions. Since each pigeon is assigned to a preparation at random and
titrated separately, the numerator and denominator of $P_u$ are
independent and hence have zero covariance. The pooled variance of v,
$s^2$, is computed by Equation (2), substituting v for z. The error variance
of $\bar{v}_U$ in the denominator of $P_u$ is $s^2/N_U = v_{bb}$.
Substituting in Equation (8), we have
$C = \bar{v}_U^2 / (\bar{v}_U^2 - s^2 t^2 / N_U)$, leading to the confidence
interval

    $L = 2 \sqrt{(C - 1)(C P_u^2 + R^2 N_U / N_S)}$    (42)


from which $X_{P_u} = C P_u \pm \frac{1}{2} L$.


SLOPE-RATIO ASSAYS 


In some assays, the response y can be plotted as a straight line 
against arithmetic dosage units. The linear relation is usually restricted 
to the smaller dosage levels, and depending upon the nature of the 
response, the slope may be either positive or negative. When the 
plotted lines for the Standard (subscript 1) and for each of the m — 1 
Unknowns (subscripts 2 to m) meet at zero dose within the sampling 
error, the relative potency is given by the ratio of their slopes. Al- 
though this is not essential, the analysis is considerably shortened 
when the k doses of each preparation are spaced at equal arithmetic
intervals and coded, such as to x = 0, 1, 2, 3, ..., k (Barraclough
[1955]; Bliss [1946], [1952b]; Finney [1952b]; Wood [1953]). The slopes 
are computed with a common intercept a’ at x = 0, so that 


    $Y_1 = a' + b_1 x_1, \quad Y_2 = a' + b_2 x_2, \quad \ldots, \quad Y_m = a' + b_m x_m$    (43)


In coded units, the relative potency $P'_u$, for Unknown 2, for example,
is the ratio


    $P'_u = b_2 / b_1 = a/b$    (44)


To recover the potency $P_u$ in original units, $P'_u$ is adjusted for
differences in the coding units by multiplying by $R = I_S / I_U$, the ratio
of the dosage intervals.

When the assay is fully balanced, with the same number of replicates 
at each of the k dosage levels, the inverse matrix of c;;’s required for 
computing the intercept a' and the slopes $b_1, b_2, \ldots, b_m$ need not be
determined explicitly. If control responses at x = 0 have been included
in an experiment with a total of km + 1 doses and each treatment has
f replicate responses y which total $T_t$, the common intercept is
determined as


    $a' = \dfrac{2(2k + 1) \sum y - 6 \sum T'_i}{f \{ 2(2k + 1) + mk(k - 1) \}}$    (45)


where $\sum y$ is the total of all N = f(mk + 1) responses and
$T'_i = \sum (x T_t)$ for each preparation. In the absence of a zero dose,
N = fmk and the equation for a' reduces to


    $a' = \dfrac{2(2k + 1) \sum y - 6 \sum T'_i}{fmk(k - 1)}$


In either case, the slope for each preparation is computed as


    $b_i = \dfrac{3}{2k + 1} \left\{ \dfrac{2 T'_i}{fk(k + 1)} - a' \right\}$    (46)


Tests are available (Barraclough [1955]; Finney [1952b]) for deter- 
mining whether the dosage-response curves for all preparations are 
satisfactorily linear and have a common intercept within the sampling 
error. The variation about these lines is assumed to be independent 
of dose and of preparation, a condition which is usually met more 
easily as each $P'_u$ approaches unity. If the assay meets these
requirements for validity and the response units have been placed and handled
entirely at random, the error variance about the m lines can be
determined as


    $s^2 = \{ \sum y^2 - a' \sum y - \sum (b_i T'_i) \} / n$    (47)

where the degrees of freedom n = N - m - 1.

Many experimenters handle replicate tubes together throughout 
an assay, and in this case the variation among the f replicates within 
treatments may underestimate the assay error. The error variance, 
adjusted to units of a single tube, is then computed from the variation 
of the treatment totals ($T_t$) about the fitted lines as


    $s^2 = \{ \sum T_t^2 / f - a' \sum y - \sum (b_i T'_i) \} / n$    (48)

with n = m(k - 1) degrees of freedom.

With the matrix of $c_{ij}$'s and the error variance $s^2$, the variances and
covariance of the two slopes which form the numerator and denominator
of $P'_u$ can be obtained in the general case (Bliss [1946]) as
$v_{11} = c_{11} s^2$, $v_{22} = c_{22} s^2$, and $v_{12} = c_{12} s^2$. These
terms are substituted in Equation (7) to obtain
$C = b_1^2 / (b_1^2 - c_{11} s^2 t^2)$ and in Equation (8) to obtain
$\frac{1}{2} L$ and the confidence limits of $P'_u$ as


    $X_{P'_u} = C P'_u - K \pm \sqrt{(C - 1)(C P'^2_u + c_{22}/c_{11}) + K(K - 2C P'_u)}$    (49)


where $K = (C - 1) c_{12} / c_{11}$ and Student's t for the required level of
significance depends upon the degrees of freedom in $s^2$.

In the balanced assay described above, with equal intervals between
doses and a constant number (f) of replicates for each preparation, the
confidence interval can be computed readily. The ratio
$c_{22}/c_{11} = 1$ and the terms needed for computing C and K have simple
forms. With f values of y at x = 0, the required constants are


    $c_{11} = \dfrac{6}{fk(2k + 1)} \left\{ \dfrac{1}{k + 1} + \dfrac{3k}{2(2k + 1) + mk(k - 1)} \right\}$    (50)

and


    $\dfrac{c_{12}}{c_{11}} = \dfrac{3k(k + 1)}{(3k + 1)(k + 2) + mk(k - 1)}$    (51)


If there is no observation at x = 0, these reduce to


    $c_{11} = \dfrac{6}{fk(2k + 1)} \left\{ \dfrac{1}{k + 1} + \dfrac{3}{m(k - 1)} \right\}$

and


    $\dfrac{c_{12}}{c_{11}} = \dfrac{3(k + 1)}{3(k + 1) + m(k - 1)}$

The slope-ratio design has its main application in microbial assays 
of vitamins and amino acids, and potentially in biochemical research. 
With the above equations, exact confidence limits are no more difficult 
to compute than the usual approximations. 


TABLE 9

Calculation of Relative Potencies $P'_u$ and Their Confidence Limits
$X_{P'_u}$ in the Assay of Niacin Cited by Barraclough [1955]

                                            Preparation
Term                          1          2            3            4            5
$T'_i = \sum (x T_t)$        80.1       65.5         67.8         65.7         65.8
$b_i$ (Eq. 46)             1.45714    .93571      1.01785       .94285       .94642
$P'_u$ (Eq. 44)                       .64216       .69853       .64706       .64951
$C P'_u - K$                          .64370       .70052       .64864       .65111
$K(K - 2C P'_u)$                     -.00466      -.00507      -.00470      -.00472
$\frac{1}{2} L$ (Eq. 49)              .0818        .0831        .0819        .0820
$X_{P'_u}$ (Eq. 49)                 .562, .726   .617, .784   .567, .731   .569, .733

$\sum y = 168.4$, $\sum T'_i = 344.9$, k = 3, m = 5, f = 2,
a' = 3.2750 (Eq. 45), $b_i = \frac{3}{7} \{ T'_i/12 - a' \}$ (Eq. 46),
$\sum T_t^2 / 2 = 923.270$, $s^2 = .05242$ (Eq. 48), n = 10, $t^2 = 4.965$
(at P = .05), $c_{11} = .06493$ (Eq. 50), C = 1.008024 (Eq. 7),
$c_{12}/c_{11} = .45000$ (Eq. 51), K = .003611


The calculation may be illustrated with the data from the balanced
niacin assay cited by Barraclough [1955]. Each of m = 5 preparations,
a Standard (preparation 1) and four Unknowns (preparations 2 to 5),
was tested at k = 3 dosage levels with f = 2 replicates. The assay
included two zero-level tubes. The totals $T_t$ of the 3 paired
measurements for each preparation were multiplied by the coded doses, x = 1
to 3, to obtain the $T'_i$ in the first row of Table 9, leading to the values
of $b_i$ and $P'_u$ in the next two rows. The preparations proved
satisfactorily homogeneous in individual tests for validity. In addition, the
mean square for scatter about the five fitted lines divided by that
between duplicates gave a non-significant F = 1.18 ($n_1 = 10$, $n_2 = 16$).
However, since F > 1 and randomization was not reported explicitly,
the error variance $s^2 = .05242$ has been computed here by Equation
(48), leading to C = 1.008024 and the confidence intervals $X_{P'_u}$ in the
last row of the table.
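
The niacin calculation can be retraced with a short Python sketch of the balanced slope-ratio equations. It is an added illustration; the function name and argument names are hypothetical, the closed forms for a', $b_i$, $c_{11}$ and $c_{12}/c_{11}$ are those that follow from the least-squares normal equations for a common intercept, and $s^2$ and $t^2$ are supplied from the table rather than recomputed.

```python
import math

def slope_ratio_assay(T_prime, sum_y, k, f, s2, t2, zero_dose=True):
    """Balanced slope-ratio assay; T_prime[0] belongs to the Standard.

    T_prime : sums of (coded dose x treatment total) for each preparation
    sum_y   : grand total of all responses
    k, f    : doses per preparation and replicates per treatment
    """
    m = len(T_prime)
    numerator = 2 * (2 * k + 1) * sum_y - 6 * sum(T_prime)
    if zero_dose:
        base = 2 * (2 * k + 1) + m * k * (k - 1)
        a = numerator / (f * base)                                   # Eq. (45)
        c11 = 6 / (f * k * (2 * k + 1)) * (1 / (k + 1) + 3 * k / base)   # Eq. (50)
        c12_over_c11 = 3 * k * (k + 1) / ((3 * k + 1) * (k + 2) + m * k * (k - 1))
    else:
        a = numerator / (f * m * k * (k - 1))
        c11 = 6 / (f * k * (2 * k + 1)) * (1 / (k + 1) + 3 / (m * (k - 1)))
        c12_over_c11 = 3 * (k + 1) / (3 * (k + 1) + m * (k - 1))
    b = [3 / (2 * k + 1) * (2 * T / (f * k * (k + 1)) - a) for T in T_prime]  # Eq. (46)
    C = b[0] ** 2 / (b[0] ** 2 - c11 * s2 * t2)
    K = (C - 1) * c12_over_c11
    limits = []
    for b_u in b[1:]:
        P = b_u / b[0]                                               # Eq. (44)
        # Eq. (49) with c22/c11 = 1 for the balanced design
        half = math.sqrt((C - 1) * (C * P * P + 1.0) + K * (K - 2 * C * P))
        centre = C * P - K
        limits.append((P, centre - half, centre + half))
    return a, b, C, K, limits

a, b, C, K, limits = slope_ratio_assay(
    [80.1, 65.5, 67.8, 65.7, 65.8], sum_y=168.4, k=3, f=2, s2=0.05242, t2=4.965)
for P, lower, upper in limits:
    print(round(P, 5), round(lower, 3), round(upper, 3))
```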


SUMMARY 


Bioassays are of two types, those in which the threshold dose of a 
given drug is a dependent variable, measured directly in each test 
animal, and a far larger class in which dose is the independent variable 
and the size of the reaction is the measured dependent variable. In the 
few assays of the first type, the log-potency is computed from the 
difference between two means or from a mean difference; its confidence 
limits are based upon familiar assumptions and equations, which are 
reviewed briefly. In the great majority of assays, the potency or its 
logarithm is computed from the ratio of two statistics and its confidence 
limits are the roots of a quadratic equation. This equation in an 
especially simple form, which was proposed originally by Marks for 
the cross-over assay of insulin, has been generalized for all assays based 
upon a ratio, including those with a significant covariance between 
numerator and denominator. Some of its applications are illustrated 
with examples of vitamin, hormone and insecticide assays. 

The general form for limits without covariance has been adapted 
for selected parallel-line assays, many appearing in U.S.P. XV. These 
include balanced factorial assays with one Unknown; partially balanced 
factorial assays, where the Standard and Unknown differ by a missing 
end dose but are otherwise equivalent; factorial assays with unequally 
spaced log-doses, which occur frequently in microbial assays; assays 
with more than one Unknown tested concomitantly with the same 
Standard; and assays in balanced pairs, such as the twin cross-over
assay for insulin. In each of these, the numerator and denominator 
are uncorrelated by virtue of the design.

In several parallel-line assays, potential correlation between
numerator and denominator is usually negligible and can be ignored in com-
puting the confidence interval. Among these are assays in randomized 
sets, where the sets may differ significantly in the slope of the log-dose 
response curve; factorial assays in which missing responses are replaced 
later by calculation; and assays from the successive reactions of a single 
animal or piece of tissue, in which the responses to a constant dose of 
Standard are converted to moving averages when calculating the log- 
potency. In this last case, the approximate equations proposed for the 
U.S.P. XV posterior pituitary assay have been tested by experimental 
sampling. 

The confidence interval for all-or-none assays depends in part 
upon the transformation adopted for the percentage response, although 
covariance is not a complication. With the more convenient angular 
transformation and responses in the range from 10 to 90 percent, the 
simple equations for balanced factorial designs are applicable. The 
additive probit transformation follows directly from the underlying 
mathematical model and imposes no theoretical limitation upon the 
range of response, but unequal weighting and an iterative solution 
increase the amount of computing. If the assay is designed factorially, 
the log-doses can be coded and objective expected probits computed 
from a preliminary analysis in unweighted probits. These lead to a 
weighted maximum likelihood solution with appreciably less calculation. 

In assays based upon the threshold dose, potency can be computed 
directly from the ratio of two mean doses without the use of logarithms, 
as in the U.S.P. pigeon assay for digitalis; its confidence interval is 
that for a ratio without covariance. Logarithms are also avoided when 
the response is a linear function of the dose rather than of the log-dose, 
and potency is determined from the ratio of two slopes as a slope-ratio 
assay. The numerator and denominator of this ratio are always cor- 
related, so that covariance enters unavoidably into the calculation of 
its confidence interval. Relatively simple equations are presented for 
balanced assays with several Unknowns and the same number of
replicates for each treatment.


QUERIES


George W. Snedecor, Editor


QUERY 123: In Biometrics 11, No. 3 (1955) G. A. McIntyre analyses
the results of a complex experiment in which 4 treatments are
applied in two latin squares to 4 different test plants and leaves
in 4 positions in the test plants. The results of treatments are then
assayed in 4 Graeco-Latin squares each with 4 assay plants, 4 positions
of leaves within the assay plants, and two treatments on the two halves
of each leaf.

Clearly the variance component $\sigma_\lambda^2$ in line 15 of Table 2 has
been obtained as half the difference of the mean squares in lines 10 and 14.
We can, however, obtain an estimate of $\sigma_\lambda^2$ in another way. The
expected values of the mean squares in lines 8 and 9 are identical and
the mean squares can be pooled; and likewise for lines 12 and 13. This
gives


                      D.F.    Pooled mean square
Lines 8 and 9          24          38.41
Lines 12 and 13        24          18.52
Difference                         19.89


Now it will be seen that the difference between the expected values in
lines 8, 9 and 12, 13 is again $2\sigma_\lambda^2$, so that we obtain a second
estimate


    $\sigma_\lambda^2 = 19.89/2 = 9.95$


instead of the value of 2.10 of McIntyre. They differ almost by a
factor 5; which estimate is the correct one?

This question raises a more general problem. In McIntyre's Table 1
we find 14 different mean squares. The mean squares in lines 1 and 11,
8 and 9, and 12 and 13 have identical expected values and can therefore
be pooled. Even then, however, 11 mean squares remain against only
8 variance components, so that these variance components are not
uniquely determined.

Usually one finds in the textbooks that in an analysis of variance the 
number of mean squares is equal to, or with confounding less than, the 
number of variance components, and I am inclined to the opinion that 
this should always be so. If this opinion is correct we must conclude 
that the analysis carried out by McIntyre is not correct and can be
improved, but the experiment is of so complicated a nature that it is 
not easy to see what is the correct way and how to find it. If I am 
wrong, that is if the number of mean squares can be larger than the 
number of variance components, then the question to be solved is how 
these variance components should be estimated.

ANSWER: The analysis recorded in 'Design and Analysis of Two
Phase Experiments' has given unbiased estimates of
treatment effects and an appropriate estimate of treatment
error. The pooled analyses of the variation within Graeco-Latin
squares together with the analysis of the Latin squares has given
unbiased estimates of location components adequate for the purpose
of illustrating the effect of modification of the design.

However, the estimates of components are not the most efficient.
For instance, as indicated by the querist, further information on
$\sigma_\lambda^2$ can be obtained from the analysis of variation within
Graeco-Latin squares. The variation between letters within an alphabet in any
square is the variation in lesions due to virus concentrations associated
with particular combinations of light treatment effects and leaf position
on the test plant. This variation is potentially different for each
alphabet in each square. Those identified as Latin originate from the
first Latin square. The components given in Table 2 of the paper refer
to the mean square taken over all four Graeco-Latin squares when
orthogonality of treatment and test leaf positions obtains.

The estimate of $2\sigma_\lambda^2$ given by the querist is the average of
eight such estimates from the eight differences of sum and difference mean
squares, one from each alphabet in each square. These estimates are not
scaled $\chi^2$ variates but have variances which involve the associated
alphabet as well as $\sigma_\epsilon^2$. An estimate of the variance of the
average estimate of $2\sigma_\lambda^2$, subject to the usual assumptions of
normality for $\lambda$ and $\epsilon$, is


Prll6(or + orr)(os + 0%) + Boa + 80a0. + 4o%] 


This involves several poorly determined components. Starting with
the estimates given at the foot of Table 2 one can by a least squares
iterative process arrive at a combined estimate of $2\sigma_\lambda^2$ from
this source and from the difference of the error mean squares for the sum and
difference analysis.

Some of the information within squares on $\sigma_\lambda^2$ can be recovered
through a recasting of the analysis of the assay phase. At the same 
time this rearrangement clarifies the relation between the analyses of 
the two phases. Because these analyses have data in common there is 
some overlap in estimators for components but they are not independent. 
This overlap seems to have been the cause of the querist’s difficulties. 

For two phase designs of the types described in the paper and with 
replication in the assay phase the data for the first phase analysis is 
the sum of the second phase replicates deriving from material in a first 
phase plot. An analysis of the assay phase, which would usually be 
made only to estimate second phase components, will isolate certain 
degrees of freedom, sum of squares and corresponding components
which in total will equal the sum of the degrees of freedom, sum of
squares and components of the first phase analysis. The remaining 
sources of variation in the assay analysis will contain only second phase 
components. 

In the numerical example the entries into the first phase design are 
the totals of the Latin and Greek letters within squares of the second 
phase design. The degrees of freedom and sum of squares and
components between test plants can be identified with sets and plants
within sets in the Latin analysis and with alphabets within squares and 
squares (not given in Table 2) in the Graeco-Latin analysis. There is, 
however, no correspondence in the second phase analysis for the re- 
maining 24 degrees of freedom in first phase analysis because of the 
method followed to isolate the whole plot and sub-plot errors. By 
rearrangement a correspondence can be achieved. 

The Graeco-Latin square (in the sense of Yates, not of Cochran 
and Cox, Experimental Designs) is a partially balanced incomplete 
block design with two associate classes and with grouped blocks ortho- 
gonal to treatments in two directions. The blocks are whole plots or 
paired sub-plots. The analysis given in Table 2 is essentially the intra- 
block analysis with treatments eliminating blocks given by the sum 
of Alphabets, Latin (differences) and Greek (differences). Effects due 
to assay plants and leaf positions within plants can be removed from 
the block sum of squares, the residual block effect ignoring treatments 
consisting of the sum of the whole plot error, Latin (sums) and Greek 
(sums). In the auxiliary analysis the sums of squares between Latin 
letters and between Greek letters are substituted for Latin (differences) 
and Greek (differences) to give treatments ignoring blocks. In the 
parallel adjustment of the residual block effects to eliminate treatments 
the whole plot error remains unchanged, the other terms being Latin 
(sums) plus Latin (differences) less Latin and Greek (sums) plus Greek 
(differences) less Greek. The rearranged analysis with components is 
given in the accompanying table. 

With this arrangement, the totals of the degrees of freedom, sum 
of squares and components for the first four terms of this analysis are 
the same as the corresponding totals of the Latin analysis, being different 
analyses of the same entries. The remaining degrees of freedom are 
associated only with assay components. The test plant components 
are estimated from the Latin analysis and the assay components from 
the mean squares corresponding to these remaining degrees of freedom. 
The whole analysis of the 127 degrees of freedom could be formally 
presented with 31 degrees of freedom for the first phase analysis and 
the remaining 96 degrees of freedom as given in the latter part of the
auxiliary analysis.

The variance component ¢?2 can be estimated as Eva 0:60 527.60)
or 2.28 in place of $\tfrac{1}{2}(11.80 - 7.60)$ or 2.10. With some slight gain in
precision, estimates of v4 and o can be derived from the three scaled
$\chi^2$ estimators of eet Cee ae a , and of with 24, 12 and 36 d.f.
respectively by maximum likelihood procedure. After one cycle of
iteration the estimates are of = 2.21, $\sigma^2$ = 7.63.

In estimating components in the analysis of these data it has been 
assumed that interactions among treatments and sources of variation 
from plant material can be ignored. It should also be mentioned that 
unconscious variation in technique in applying virus to the leaves 
in the assay phase may have contributed to variation in lesion counts. 
If such variation in technique is random then the effect will be confounded
only with $\sigma^2$, but non-random variation would also be confounded
with some or all of the other location components and perhaps
treatments. 

The supposition of the querist that in general the number of mean 
squares with different expectations in an analysis must not be greater 
than the number of components is not true for any repeated incomplete 
block design since there are two or more estimates with different expec- 
tations involving only block and plot effects. Even with a single 
classification one can have an analogous case. 

Suppose that a random sample of strains of a species is tested for
variation in yield, using a completely randomised arrangement. The
strains are randomly divided into two groups, the $k_1$ members of the
first group being replicated $r_1$ times, the $k_2$ members of the second
group $r_2$ times. The data could be analysed on Model II as below into
four contrasts, each involving one or both of two components of variation.
On the assumption that both the strain means and replicate
elements of variation are normally distributed these mean squares are
scaled $\chi^2$ variates.


                                               Expectation of mean square
                                               (coefficient of)
Variation         Degrees of freedom           $\sigma_s^2$                              $\sigma^2$

Between groups    1                            $r_1 r_2 (k_1 + k_2)/(r_1 k_1 + r_2 k_2)$   1
Within group 1    $k_1 - 1$                    $r_1$                                     1
Within group 2    $k_2 - 1$                    $r_2$                                     1
Replicate error   $k_1(r_1 - 1) + k_2(r_2 - 1)$                                          1

The mean square between strains taken as a whole has an expectation of

$$\frac{1}{k_1 + k_2 - 1}\left\{ r_1 k_1 + r_2 k_2 - \frac{r_1^2 k_1 + r_2^2 k_2}{r_1 k_1 + r_2 k_2} \right\}\sigma_s^2 + \sigma^2$$

which is not distributed like $\chi^2$. The conventional procedure would be
to estimate $\sigma_s^2$ from this in conjunction with the replicate error mean
square. A more efficient estimate could be obtained from the four
mean squares of the analysis by maximum likelihood procedure.
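
The expectations in this example can be checked with exact arithmetic. The following is only a sketch: the function names are illustrative and not from the original note, and the strain variance is written as the "strain coefficient" that multiplies it, the coefficient of the replicate error variance being 1 on every line. It confirms that the coefficient for the mean square between all strains is the degrees-of-freedom-weighted mean of the three between-strain lines of the four-line analysis.

```python
from fractions import Fraction


def ems_table(k1, r1, k2, r2):
    """Return {line: (degrees of freedom, coefficient of the strain variance)}.

    The coefficient of the replicate error variance is 1 on every line.
    """
    n_plots = r1 * k1 + r2 * k2  # total number of plots in the experiment
    return {
        "between groups": (1, Fraction(r1 * r2 * (k1 + k2), n_plots)),
        "within group 1": (k1 - 1, Fraction(r1)),
        "within group 2": (k2 - 1, Fraction(r2)),
        "replicate error": (k1 * (r1 - 1) + k2 * (r2 - 1), Fraction(0)),
    }


def pooled_strain_coefficient(k1, r1, k2, r2):
    """Coefficient of the strain variance in E(MS) between all strains."""
    n_plots = r1 * k1 + r2 * k2
    sum_sq_reps = r1 ** 2 * k1 + r2 ** 2 * k2
    return (n_plots - Fraction(sum_sq_reps, n_plots)) / (k1 + k2 - 1)


def pooled_from_table(k1, r1, k2, r2):
    """Degrees-of-freedom-weighted mean of the three between-strain lines."""
    table = ems_table(k1, r1, k2, r2)
    lines = ["between groups", "within group 1", "within group 2"]
    total_df = sum(table[name][0] for name in lines)
    return sum(table[name][0] * table[name][1] for name in lines) / total_df
```

With equal replication ($r_1 = r_2 = r$) every between-strain coefficient reduces to $r$, and the three lines can be pooled into a single scaled $\chi^2$ variate; with unequal replication they cannot, which is the point of the example.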


G. A. McIntyre


ABSTRACTS 


Papers presented at joint Biometric Society (ENAR), A. S. A. and I. M. S. Sessions 
Detroit, September 7-10, 1956. 


389 DAVID W. ALLING (Hermann M. Biggs Memorial Hospital,
Ithaca, New York). The After-History of Pulmonary Tuberculosis;
A Stochastic Model.


Patients with pulmonary tuberculosis were classified at the time of 
diagnosis and at succeeding annual intervals as being in one of the 
following three clinical states: 


R  having arrested tuberculosis
A  having active tuberculosis
D  being dead of tuberculosis


By suitable division of these three states into six substates a discrete, 
stationary stochastic model is constructed which is found to correspond 
reasonably well with sets of observed data. 


390 IRA A. DeARMON, JR. Control of Precision in the Plate
Count Assay.


In estimating the number of viable organisms in a sample concen- 
tration of bacteria, the measurement consists of the plating operation 
and the serial dilution operation. The variability of each of these 
operations has been evaluated for a fastidious microorganism and has 
been shown to be consistent with that variability expected in a Poisson 
distribution. 

Utilizing the $\chi^2$ distribution, a control procedure has been suggested
whereby both the plating operation and the serial dilution operation 
can be assessed for excessive variability on the same control chart. 
This control procedure has been applied with success to an actual 
program carried out in a biological laboratory. 
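
The abstract does not state which statistic is charted. As a hedged illustration only, one standard way to test replicate plate counts for excess variability against the Poisson expectation is the index of dispersion, referred to $\chi^2$ with $n - 1$ degrees of freedom. The function names below are hypothetical, and the upper limit uses the Wilson-Hilferty approximation rather than exact tables.

```python
from statistics import NormalDist, mean


def poisson_dispersion(counts):
    """Index-of-dispersion statistic and its degrees of freedom."""
    m = mean(counts)
    stat = sum((c - m) ** 2 for c in counts) / m
    return stat, len(counts) - 1


def chi2_upper_limit(df, alpha=0.05):
    """Approximate upper chi-square point (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


def out_of_control(counts, alpha=0.05):
    """True when the replicate counts vary more than Poisson sampling allows."""
    stat, df = poisson_dispersion(counts)
    return stat > chi2_upper_limit(df, alpha)
```

A set of replicate plates such as `[8, 12, 10, 10]` gives a statistic of 0.8 on 3 degrees of freedom, well inside the limit; whether the author's chart combined the plating and dilution stages in this way is not recoverable from the abstract.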


391 D. B. DeLURY (Ontario Research Foundation). Elements
of the Analysis of Covariance.


The point of view followed throughout the paper is that the analysis 
of covariance is simply regression theory adapted to data which contain 
a considerable amount of orthogonality. Two examples are discussed, 
to bring out a little of the algebra by which standard regression theory 
is organized in the covariance pattern and some of the arithmetical 
technique that goes with it. One example involves two samples, each 
with several observations, and one concomitant variable, the other is 
based on an orthogonal experiment with two concomitant variables.


392 CHARLES W. DUNNETT (Lederle Laboratories, American
Cyanamid Company, Pearl River, New York). Multiple Decision
Procedure.


This is an expository paper describing the use of multiple decision 
procedures as an alternative to the analysis of variance in problems 
concerned with selecting the “best” of several treatment categories.
The particular procedures described are those due to Bechhofer (Ann. 
Math. Stat., 25: 16-39) and Somerville (Biometrika, 41: 420-429). 


393 MINDEL C. SHEPS AND PAUL L. MUNSON (Harvard University).
The Role of Between-Assay Error in Biological
Assays.


Discrepancies in relative potency estimates within a biological assay
method may be due to deviation from one or more of the assumptions
implicit in the assay model. In a chick comb assay method for androgenic
potency, observed variation between replicate M’s for urine
extracts had been three times as great as predicted. No systematic
causes were apparent from a study of the data. 

A change in technique that simplified the problems of handling the 
material to be applied to the comb resulted in only slight changes in 
the intra-assay variability, while the residual variance and assay 
slopes were still heterogeneous. However the observed variation of 
the M’s for urine extracts now agreed very well with the intra-assay 
predictions. M’s for steroids of varying chemical structure had an 
observed variance 1.7 times as great as predicted. 

An experiment was performed consisting of 4 assays (replications) 
comparing the effects of 4 equally spaced log doses of androsterone 
(the reference standard), a urine extract (L25) and the steroid methyl 
testosterone (MT). In a combined analysis of variance the only signifi- 
cant interaction found was for assays by differences in the slope of MT 
as compared with the other two slopes. The observed variation in M’s 
for both unknowns was not significantly different from that predicted. 

The M’s obtained with a parathyroid assay method were also not 
excessively variable despite the marked variation in residual $s^2$ and,
in many of the assays, poorly determined slopes. 

The present data as well as numerous reports in the literature suggest
that inter-assay variation, both in biological and chemical assay methods,
can be detected only if sought for. When present, it should
be included in estimates of error. When it is due to a cause other than 
non-parallelism it is often possible to diminish it markedly or to eradicate 
it completely.


394 R. LOWELL WINE (Virginia Polytechnic Institute). On the
Comparison of Multiple Test Procedures.


Testing the homogeneity of a set of means by the F-test may result 
in the conclusion that they are not all alike, but it fails to signify any 
arrangement of distinguishable groups among the means. The problem 
of separating a group of heterogeneous means into subgroups of means 
said to be non-heterogeneous has been approached in several different 
ways by a number of research workers. This has caused many workers 
outside the field of statistics, who in the past have used the least significant
difference test, to wonder which of the many available tests should
be applied in a given experiment. Using the least significant difference 
test as a starting point, some of the best multiple range tests are com- 
pared to this familiar test with respect to error rates and ease of applica- 
tion. It is pointed out that the multiple F-tests, being more difficult 
to apply, probably should not be used unless the investigator is inter- 
ested in contrasts involving more than two means. 


Papers presented at joint Biometric Society (WNAR) meetings and I.M.S. sessions, 
Seattle, Washington, August 22-24, 1956


395 MOHAMED S. AHMED (University of California, Berkeley,
California). A Stochastic Model for the Tunnelling and Retunnelling
of the Flour Beetle.


The stochastic model developed to describe the tunnelling and
retunnelling of the flour beetle is a Markov chain with only three
states: $s_1$, tunnelling; $s_2$, stationary; and $s_3$, retunnelling, and with
continuous time parameter and stationary transition intensities
$q_{ij}$ ($i, j = 1, 2, 3$). Transition from one state to any other is visualized
with the restriction that the beetle cannot move from the tunnelling
state to the retunnelling state, or vice-versa, except by passing through
the stationary state.

For this model: 1) the probability that the beetle is in state $s_j$ at
the end of the time interval $t$ given that it was in state $s_i$ at the beginning
of this time interval, as well as 2) the expected time spent in a state $s_j$
out of the total exposure time $T$, is found explicitly in terms of the
four unknown parameters $q_{12}$, $q_{21}$, $q_{23}$ and $q_{32}$. In addition, a scheme
for the estimation of these parameters is given.
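
The transition probabilities of such a chain can also be obtained numerically, which may help in checking any explicit formula. The sketch below is not the author's derivation: it builds the intensity matrix with the two forbidden transitions set to zero and evaluates $P(t) = e^{Qt}$ by a scaled Taylor series in plain Python. The function names and the parameter values used to try it are assumptions for illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=20):
    """Matrix exponential by a Taylor series with scaling and squaring."""
    n = len(m)
    norm = max(sum(abs(x) for x in row) for row in m)
    squarings = 0
    while norm > 0.5:  # scale the matrix down so the series converges fast
        norm /= 2.0
        squarings += 1
    scale = 2.0 ** squarings
    scaled = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):  # undo the scaling by repeated squaring
        result = mat_mul(result, result)
    return result


def transition_probabilities(q12, q21, q23, q32, t):
    """P(t) for states (tunnelling, stationary, retunnelling); q13 = q31 = 0."""
    q = [[-q12, q12, 0.0],
         [q21, -(q21 + q23), q23],
         [0.0, q32, -q32]]
    return mat_exp([[x * t for x in row] for row in q])
```

For large $t$ every row of $P(t)$ approaches the stationary distribution, which for this chain is proportional to $(1,\ q_{12}/q_{21},\ q_{12}q_{23}/(q_{21}q_{32}))$; these limits give the long-run proportions of time spent in each state.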

This model can be used to study the differences in the behavior of,
say, the male and the female beetles with regard to the proportion of
time they spend in the various states.


396 WALTER BECKER. An Investigation of the Log Transformation
of Growth Data.


In order to justify transforming rat body weight data to logs, an 
investigation was made of the birth to 98 day weight data expressed 
in grams and log grams. The data in grams indicated almost complete
proportionality of the within litter standard deviation with the unweighted
mean when plotted on log-log paper; a large significant
interaction mean square between the two sexes and litters was also found
in the analysis of variance. Transforming to logs materially reduced
the proportionality and succeeded in removing most of the interaction.


397 P. HORST (University of Washington, Seattle, Washington).
Optimal Estimates of Multiple Criteria with Restrictions on the
Covariance Matrix of Estimated Criteria.


In the prediction of personal adjustment a generally appropriate 
model involves a set of predictors as independent variables and a set of 
criterion or dependent variables. The conventional “least squares” 
methods for estimating the criterion variables from the predictor 
variables have long been well known. However, in problems of differ- 
ential classification it may be desirable to impose certain restrictions 
on the estimated criterion variables. One set of conditions which 
appears to be useful in this connection is that the variance-covariance 
matrix of the estimated criterion variables shall be as nearly equal as 
possible in the “least squares” sense to a prespecified Gramian matrix.
One such matrix is a function of specified quotas in the differential
classification problem. It is proposed that an appropriately constrained
“least squares” solution for the estimated criteria may provide a basis
for differential classification which will approximate specified quotas 
and which is much simpler in application than the conventional linear 
programming approach. The solution for the matrix of regression 
vectors with specified restrictions is derived. 


398 HAROLD HOTELLING (Institute of Statistics, University of
North Carolina, Chapel Hill, North Carolina). New Light on the
Multiple Correlation Coefficient.


This paper is analogous to the author’s work on the simple correla- 
tion coefficient r (J.R.S.S., Vol. 15B, 1953, 193-232), but deals with 
problems, both mathematical and logical, of a somewhat different 
kind. Starting from Sir Ronald Fisher’s work on the distribution of 
the multiple correlation coefficient R, emendations are made regarding
the geometrical derivation and the asymptotic approximation. Improved
methods of calculating the distribution of R are obtained.
An investigation is then made of transformations of R corresponding 
to those of the simple correlation coefficient r. The problem for R is 
considerably different because R is restricted to positive values and 
has great skewness even for the case analogous to symmetry for r. 
This leads to a very different situation in such problems as discriminating
between the values of the correlation in different samples, or averaging
of transformed correlations. Consideration is given to the “shrinkage”
of R alleged by some psychologists, and to other problems arising
in applications, including test construction and selection of predictors 
by choosing maximum multiple correlations. 


399 FLOYD A. JOHNSON (Pacific Northwest Forest and Range
Experiment Station, U.S.A.). The Role of Statistical Methods
in Forest Research.


Statistical methods had a rather minor role in forest research until 
the early 1930’s. Before that time the terms statistical methods and 
calculation techniques were practically synonymous. Since that time 
the work of the English statisticians, particularly R. A. Fisher, has 
been responsible for a slow revolution in forest research procedures. 
Statistical considerations now affect all phases of the experimental 
process; the planning phase, the field phase and the analysis phase. 
The development of inferences from experimental data is recognized 
as a statistical problem, and these inferences are no longer left un- 
qualified. Nowadays foresters in research are increasingly preoccupied 
with determining how far a sample estimate may have missed its mark 
or how much chance had to do with apparent differences among observed
phenomena.


400 LINCOLN E. MOSES. Some Useful Non-parametric Techniques.


Occasionally one may have a fully known population distribution 
F(x) and desire to test the hypothesis that he is sampling from that 
population against slippage alternatives. Such a known distribution 
will often be found tabulated in The Statistical Abstract, and not have 
a specified functional form. The test statistic proposed is

$$\frac{\sum_{i=1}^{N} F(x_i) - N/2}{\sqrt{N/12}}$$

which may be taken as a unit normal deviate for $N > 5$. The test is
in fact Wilcoxon’s test for one sample size infinite, and so enjoys the 
same properties. 
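
As a rough sketch of how such a statistic might be computed when $F(x)$ is available only as a table, the code below interpolates linearly between tabulated points and standardises the sum of the $F(x_i)$ by its null mean $N/2$ and null variance $N/12$ (each $F(x_i)$ being uniform on (0, 1) under the hypothesis). The linear interpolation, the function names and the table values in the usage note are assumptions for illustration, not part of the abstract.

```python
from bisect import bisect_right
from math import sqrt


def tabulated_cdf(points):
    """Build F(x) by linear interpolation between tabulated (x, F(x)) pairs."""
    xs = [x for x, _ in points]
    fs = [f for _, f in points]

    def cdf(x):
        if x <= xs[0]:
            return fs[0]
        if x >= xs[-1]:
            return fs[-1]
        i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
        x0, x1, f0, f1 = xs[i - 1], xs[i], fs[i - 1], fs[i]
        return f0 + (f1 - f0) * (x - x0) / (x1 - x0)

    return cdf


def slippage_deviate(sample, cdf):
    """Sum of F(x_i), centred at N/2 and scaled by sqrt(N/12)."""
    n = len(sample)
    return (sum(cdf(x) for x in sample) - n / 2.0) / sqrt(n / 12.0)
```

For example, with a hypothetical table `[(0, 0.0), (10, 0.5), (20, 1.0)]`, a sample lying entirely at the tabulated median gives a deviate of zero, while a sample shifted toward the upper tail gives a large positive deviate.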

In the experimental set up associated with Friedman’s analysis of 
variance by ranks one may have in mind an alternative hypothesis 
which makes desirable the identification of a “large treatment” (if 
any); an overall test of homogeneity as provided by Friedman’s statistic 
is not especially sensitive for this purpose. The statistic proposed is the 
maximum rank sum, which, if sufficiently large, leads to rejection of
the null hypothesis. The asymptotic distribution of the suitably 
normalized statistic is just that of

$$\frac{x_n - \bar{x}}{\sigma}$$

as tabulated by Pearson and Hartley. Some exact calculations indicate
that the asymptotic values are adequate for samples of small to medium 
size as well. 


401 EARL R. RICH (University of California). A Stochastic Model 
for the Number of Beetles on the Surface of Flour. 


The problem considered is that of devising a stochastic model for 
the numbers of beetles on the surface of flour as a function of time. 
It arises in connection with analytical studies on the population dynamics
of the flour beetle Tribolium confusum.

All beetles start at time $t_0$ on the surface of the flour, and we assume
that for all $t$: (a) the probability that a beetle on the surface at time $t$
makes a transition to the interior in an interval $t$ to $(t + \Delta t)$ is $\lambda\,\Delta t$; (b) the
probability that a beetle in the interior at time $t$ makes a transition to
the surface in an interval $t$ to $(t + \Delta t)$ is $\mu\,\Delta t$.

The probability that a beetle on the surface at $t = 0$ will be on the
surface at $t$ may be written

$$F(t, \lambda, \mu) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu)t}.$$

Considering $n$ beetles and assuming them to act independently of one
another we can write that $X_t$, the number of beetles on the surface at
time $t$, is a binomial random variable with probability $F(t, \lambda, \mu)$. The
parameters were estimated using two beetles. It was found that the
assumption of independence is not tenable, the duration of stay in the
states where both beetles are together being far in excess of that implied.
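
A minimal sketch of the model under the independence assumption, taking the standard two-state solution $\mu/(\lambda+\mu) + \lambda/(\lambda+\mu)\,e^{-(\lambda+\mu)t}$ for a beetle that starts on the surface. The function names and the numerical rates used to try it are illustrative assumptions, not values from the abstract.

```python
from math import comb, exp


def surface_probability(t, lam, mu):
    """Probability that a beetle starting on the surface is there at time t.

    lam is the surface-to-interior rate, mu the interior-to-surface rate.
    """
    total = lam + mu
    return mu / total + (lam / total) * exp(-total * t)


def surface_count_pmf(n, t, lam, mu):
    """Binomial probabilities for the number of n independent beetles on the surface."""
    p = surface_probability(t, lam, mu)
    return [comb(n, x) * p ** x * (1.0 - p) ** (n - x) for x in range(n + 1)]
```

Comparing `surface_count_pmf(2, t, lam, mu)` with the observed proportions of time that zero, one or both beetles are on the surface is one way to see the departure from independence that the abstract reports.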


NEWS AND ANNOUNCEMENTS 


Members are invited to transmit to their National or Regional Secretary 
(if members at large, to the General Secretary) news of appointments, 
distinctions or retirements and announcements of professional interest. 


DR. J. W. TREVAN, F.R.S.

Dr. J. W. Trevan, formerly Director of the Wellcome Research 
Laboratories and temporary Director of research in the Wellcome 
Foundation Ltd., died on 13 October, 1956, aged sixty-nine. 

Dr. Trevan’s work in quantitative pharmacology at the Wellcome 
Laboratories in the 1920’s directed his attention to basic experimental 
and statistical considerations affecting bioassay. In this field he was 
an eminent pioneer and originator, and became particularly distin- 
guished for his contributions to the biological standardization of insulin. 
During his long career he received many scientific honours, including 
election as a Fellow of the Royal Society of London in 1946. He was 
a charter member of the Biometric Society, and the Society in general 
and its British Region in particular owed much to his continuing 
kindly interest. 

Dr. W. Edwards Deming was awarded the Shewhart Medal for 1955
by the American Society for Quality Control in recognition of his
many contributions in this field.


Dr. David B. Duncan has joined the staff of the Institute of Statistics, 
University of North Carolina, at Chapel Hill to participate in the 
teaching and research program of the Department of Biostatistics of 
the School of Public Health and the Statistics Department. He will 
have responsibility for the graduate programs in Biostatistics. 


Dr. Arnold H. E. Grandage has been appointed Associate Professor 
in the Department of Experimental Statistics, North Carolina State 
College. He will be in charge of the electronic computing laboratory, 
and will do teaching and research in industrial statistics. 


Dr. Eugene Lukacs, formerly with the U. S. Office of Naval Research,
is now professor at the Catholic University of America, Washington,
D.C.


Dr. Frank J. Massey, Jr., is now Associate Professor of Biostatistics,
School of Public Health, University of California at Los Angeles. 

Mr. John W. Mayne is now Director of Operational Research, 
Royal Canadian Navy. 


Dr. G. B. Oakland, after two years at Marischal College, University 
of Aberdeen, has returned to the Statistical Research and Service Unit, 
Science Service, Department of Agriculture, Ottawa, Canada.


Dr. Bernard Ostle, Professor of Mathematics, has been appointed
Director of the recently formed Statistical Laboratory, Montana 
State College, Bozeman, Montana. 


Effective December 1, Professor J. A. Rigney will spend two years 
in Peru administering an Agricultural Research Program for which 
North Carolina State College has responsibility. 


Dr. Vincent Schultz is now Associate Professor of Biostatistics, 
School of Public Health, University of California, Los Angeles, Cali- 
fornia, U.S.A. 


Professor Hugh Fairfield Smith of the Institute of Statistics, North 
Carolina State College has been requested by the Statistical branch 
of the FAO of the United Nations to serve as agricultural statistician 
to the Philippines Government for one year, beginning October 1, 1956. 


It was wrongly announced in the September issue of this journal 
that Professor G. W. Snedecor was the recipient of an honorary Doctorate 
of Laws from the University of North Carolina. The degree actually 
conferred in recognition of his many outstanding services to statistics 
was an honorary Doctorate of Science, by North Carolina State College. 


INTERNATIONAL STATISTICAL INSTITUTE 


The Fifth Term of the International Statistical Education Centre, 
Beirut, opened on 25 October 1956 and will end on 10 May, 1957. 
For the Tenth term of ISEC, Calcutta, 16 July 1956 to April 1957, 
thirty students have been selected from Burma, Ceylon, India, Japan, 
Pakistan, the Philippines, Singapore and Thailand.


Mr. Ragnar Thorn has been appointed by the Organizing committee 
as Secretary General for the 30th I.S.I. Session which will be held in 
Stockholm, 8-15 August, 1957. 


INTERNATIONAL TRAVEL 


The National Science Foundation will award individual grants to 
defray partial travel expenses for a limited number of American scien- 
tists participating in the 30th Session of the International Statistical 
Institute or the Congress of the International Union for the Scientific 
Study of Population, scheduled to meet in Stockholm, Sweden, August 
8 to 15, 1957. Application blanks may be obtained from the National 
Science Foundation, Washington 25, D.C. Completed application 
forms must be submitted by March 1, 1957.


POSTGRADUATE DIPLOMA COURSE


The University of Aberdeen has established a postgraduate Diploma 
in Statistics, teaching for which will begin in October, 1957. Regula- 
tions are obtainable from the University. As the number that can be 
taken initially is severely limited, prospective students resident outside
the British Isles in particular are urged to apply for admission not 
later than January 1957. The University Reader in Statistics is D. J. 
Finney, Sc.D., F.R.S., F.R.S.E., Marischal College, Aberdeen, Scotland.


CONTRIBUTED PAPERS ENAR ANNUAL MEETING


Members wishing to contribute papers to the annual meeting of 
ENAR to be held in Atlantic City, N.J., 10-13 September 1957, should 
send abstracts by April 1, 1957 to 

Dr. Vincent Schultz, Program Chairman, 
Annual Meeting, Biometric Society ENAR, 
Agricultural Experiment Station, 

College Park, Maryland, U.S.A. 


THE BIOMETRIC SOCIETY 


SCIENTIFIC SESSIONS


Brazilian Region 


At a meeting on 6 July 1956, held at Ouro Preto jointly with the 
Brazilian Association for the Advancement of Science, the following 
papers were presented: C. G. Fraga, Jr., The analysis of non-orthogonal 
experiments; J. S. Daniel, I. R. K. Abramof and T. Silva, The numbers 
of stomata in Coffea under different experimental conditions; A. Groszmann,
Analysis of maize plant growth; S. B. Henriques and F. R.
Mandelbaum, Bioassay for the manometric determination of glutathion;
F. P. Gomes, Factorial experiments in balanced incomplete blocks. 


Japan 


The Fourth Meeting (Third Spring Meeting) of the Biometric 
Society, Chapter of Japan, was held at Tokyo University on 12 April 
1956. Papers presented included: Statistical Analysis in Classification 
of Foxtail Millet (Setaria italica) in Japan (S. Kitano); Comparison
between the Efficiencies of the J.P. VI and U.S.P. XV Tests for Pyrogen
by the Monte Carlo Method (S. Kadokawa and S. Shintani); Studies
on the Discrimination Method between the Chronic Hepatic and 
Non-hepatic Diseases with Application of the Linear Discriminant 
Function (K. Takahashi and E. Miyoshi); Precision of Variety Tests 
in Japan (T. Okuno); An Elementary Method of Construction of 
Punched Cards for P"- and Other Designs (M. Masuyama). 


OFFICERS 
General 


Beginning 1 January 1957, Dr. Allyn W. Kimball replaces Dr. C. I. 
Bliss as Treasurer of the Society. Dr. Kimball’s address is P. O. Box 
10088, Knoxville 19, Tennessee, U.S.A. 


WNAR 


At the Annual Meeting held in Seattle on 23 August 1956, the
following were elected: Regional President, D. G. Chapman; Regional
Secretary-Treasurer, Elizabeth Vaughan; Executive Committee Members,
W. J. Dixon, C. R. Li and Elizabeth Scott.


Regional and National Secretaries, and their current addresses are
listed below. 


Regional Secretaries 


Australasian: Dr. G. D. Watson, Department of Statistics, The 
Australian National University, G.P.O. Box 4, Canberra, A.C.T. 
Australia. 

Belgian: Dr. L. Martin, 165 Ave. du Domaine, Forest, Bruxelles, 
Belgium. 

Brazilian: Dr. P. M. Freire, Instituto Biologico, Caixa Postal 7119,
Sao Paulo, Brazil. 

British: Mr. E. C. Fieller, Ministry of Supply, Stat. Advisory Unit, 
Room 426, Shell Mex House, London, W.C.2, England. 

ENAR: Dr. A. M. Dutton, Box 287, Station 3, Rochester 20, 
N.Y., U.S.A. 

French: M. D. Schwartz, S.E.I.T.A., 2, Ave. d’Orsay, Paris 7°, 
France. 

German: Prof. Dr. Wilhelm Ludwig, Zoologisches Institut, Univ. of 
Heidelberg, Sofienstr. 6, Heidelberg, Germany. 

Italian: Dr. L. L. Cavalli-Sforza, Istituto Sieroterapico Milanese, 
via Darwin 20, Milano, Italy. 

WNAR: Miss Elizabeth Vaughan, 2325 Seventh Street, Bremerton, 
Washington, U.S.A. 


National Secretaries

Danish: Mr. N. F. Gjeddebaek, Skjoldagervej 25, Gentofte, Denmark. 

Indian: Dr. K. Kishen, Dept. of Agriculture, Uttar Pradesh, Luck- 
now (U.P.), India. 

Japanese: Mr. M. Hatamura, National Inst. of Agr. Research, 
Nishigahara, Kita-ku, Tokyo, Japan. 

Netherlands: Mr. E. van der Laan, Kornoeljestraat 60, ’s-Graven- 
hage, Netherlands. 

Swedish: Dr. H. A. O. Wold, Jirnbrogatan 18, Uppsala, Sweden.

Swiss: Dr. H. L. LeRoy, Riedthofstrasse 34, Regensdorf, ZH. 


544 BIOMETRICS, DECEMBER, 1956 


INTERNATIONAL SYMPOSIUM ON BIOMETRICAL GENETICS, AND
FOURTH INTERNATIONAL BIOMETRIC CONFERENCE,
OTTAWA, AUGUST 28-31, 1958


Plans for this Conference, and for the Symposium, sponsored by the Inter- 
national Union of Biological Sciences, are now in being and will be published in 
1957. Meanwhile, information on the tentative scientific program may be obtained 
from the secretary of the Society, Mr. M. J. R. Healy. Queries on other pertinent 
matters should be addressed to the chairman of the Local Arrangements Committee, 
named below. 


SYMPOSIUM INTERNATIONALE SUR LA BIOMETRIE GENETIQUE, ET
QUATRIEME CONFERENCE INTERNATIONALE DE BIOMETRIE,
OTTAWA, 28-31 AOUT, 1958


Le programme de cette conférence, et du symposium (proposé par l’Union
Internationale des Sciences Biologiques), est maintenant à l’étude et on le fera
connaître en 1957. On peut cependant obtenir dès maintenant des renseignements
au sujet du programme scientifique prévu en s’adressant à M. M. J. R. Healy,
Secrétaire de la société. Toutes les demandes concernant d’autres questions relatives
à la Conférence devront être envoyées au Président du Comité d’Organisation
locale (voir dessous).


SYMPOSIUM ÜBER BIOMETRISCHE GENETIKS, UND
VIERTE INTERNATIONALE BIOMETRISCHE KONFERENZ,
OTTAWA, AUGUST 28-31, 1958


Pläne für diese Konferenz (und für das Symposium der Förderung der Internationalen
Vereinigung der biologischen Wissenschaften) sind in Vorbereitung und
werden im Laufe des Jahres 1957 veröffentlicht werden. Auskunft über das wissenschaftliche
Programm kann inzwischen vom Sekretär der Gesellschaft, Mr. M. J. R.
Healy, erhalten werden. Anfragen von allgemeinem Interesse richte man an den
Vorsitzenden des “Local Arrangements Committee” (siehe unten).


SIMPOSIO DI GENETICA BIOMETRICA E
QUARTA CONFERENZA INTERNATIONALE DI BIOMETRIA,
OTTAWA, 28-31 AGOSTO, 1958


Questa conferenza ed il Simposio, ch’è sotto gli auspici dell’Unione Internationale
di Scienze Biologiche, sono adesso organizzati ed il programma sarà pubblicato nell’anno
1957. Frattanto, informazioni sul programma scientifico provvisorio possono
essere ottenute dal segretario di questa Società, Signor M. J. R. Healy. Questioni
su altri dettagli dovrebbero essere indirizzate al Presidente del Comitato Locale
(vedi sotto).


Chairman, Local Arrangements Committee:
Président du Comité d’Organisation locale:

Dr. G. B. Oakland
Science Service Building
Department of Agriculture
Carling Avenue,
Ottawa 3, Ontario,
Canada


229, 234, 339, 386
statistical, 434
stochastic, 155, 451
Moments, 58, 265, 275, and see k statistics
generating function, 101, 266
method of, 101
Morbidity, 154 
Mortality, 227 
Moving averages, 509, and see smoothing 
Multinomial distribution, 165, 211, 265 
Multivariate analysis, 67, 190, and see 
analysis of variance, discriminant 
function, matrices 
Mutation, lethal, 216 
Newman-Keuls test, see multiple range 
test 
Newton’s method, 80 
Neyman-Pearson theory, 31, and see 
estimation, inference 
Nomography, 477 
Nonfactorial experiments, 481 
Non-orthogonal design, 346, 367 
Normal equations, 193, 362, and see least 
squares, matrices 
Normality, 398 
Nutrition, 523, and see food technology 
Obituaries, 340 
Organolepsis, 127, and see comfort, fa- 
tigue, judging, scores, selection of 
judges 
flavor, 234, 381 
quality control, 234 
randomization in, 128 
serial, 234 
simultaneous, 234 
Orthogonality, 9, 262, and see com- 
parisons 
Orthogonal polynomials, 362, 498, and 
see Gegenbauer, Jacobi 
Pairing, see paired comparisons 
Palatability, see organolepsis 
Parasitology, 154 
Pascal problem, 227 
Paternity tests, 226, and see blood 
groups, genetics 
Path coefficients, 190, and see causation 
Percentages, see proportions 
Percentiles, 228 
Perception, see organolepsis 
Perennials, 330, and see tree crops 
Pharmacology, see bioassay, toxicology 
Physical science, see industrial research 
Physiology, 89, 233, 475, and see endo- 
crinology, medicine, threshold 
Plaid squares, 348 
Plant spacing, 82, 350, and see plot size
and shape 
Plot size and shape, 82, 226, 228, 333, 
and see guard rows, plant spacing, 
split plots, strip plots 
Poisson distribution, 213, 264, 280 
compound, 264 
truncated, 265 
Polya-Eggenberger distribution, 451
Polytope, 20 
Populations, see census, distributions, 
fish, statistical genetics 
changes, 182, 230 
estimates, 163, 174, 463 
number of classes, 211, 454 
structure, 455 
Poultry, 107 
Power, 31, 34, and see tests 
Precision, 494, and see accuracy, infor- 
mation, least squares 
Prediction, 477, and see regression 
accuracy of, 478 
Preferences, see organolepsis 
Probability, 31, 211, and see runs 
Proportions, 83, 513, and see binomial 
Pseudo-factors, 246, 259 
Quality, see organolepsis 
Quality control, 465 
Quantification, see scales 
Quasifactors, see pseudo-factors 
Quasi-latin square, 245 
Radiology, 89 
Random dice, 450 
Randomer, 450 
Randomization, 128, 335, 450, 480, and 
see selection 
Randomized blocks, 348, and see incom- 
plete blocks 
Random mating, 206 
Range, see tests using range 
significant studentized 
Ranks, 128, 301, 399, and see scores, 
transformations 
Recurrence formulas, 203 
Regression, see adjusted means, canon- 
ical analysis, correlation, covariance, 
fitting regression line, orthogonal 
polynomials, trend 
additional variable, 232 
adjustment, 23 
analysis, 174, and see analysis of co-
variance
asymptotic, 323 
deviations from, 486 
error, 363 
independent variable affected by treat- 
ments, 452 
model, 226, 230, and see structure 
multiple, 192, 362 
standardized, 193 
weighted, 174 
Rejection of data, 84, 499, and see miss- 
ing values, selection 
Repeatability, 84, and see genetic corre- 
lation, heritability 
Replication, 
fractional, 1, 259, and see aliases
in mixed series, 1
in time, 229 
Research in statistical methods, 449 
Residuals, see errors 
Respiration, 89 
Response, see bioassay, dose response 
curve, models, time response curve 
critical, 75 
graded, 72 
quantal, 72, 512 
Result-guided procedures, see tests of 
significance of results suggested by 
data 
Runs, 227 
Sample size needed, 143, 179, 213, 231, 
287 
Samples, small, 264 
Sample surveys, 449, 462, and see
sampling
error control, 462
interviewer training, 464
Sampling, 143, 449, and see components
of variance, design of experiments,
sample size needed, sample surveys
ecological, 154, 182, 453
error, see variance
multistage, 229
nested, 234, 434
stratified, 283, 315
studies of statistical problems, 106,
175, 264, 290, 511
surveys, 331
time series, 231
Scales, 87, 381, 398, 416, and see scores
Scores, 127, 394, and see discriminant
function, organolepsis, ranks, scales
matrix of, 400 
tournament, see Kendall matrix 
Screening tests, 311, and see bioassay 
Selection, see choice of transformation, 
design of experiments, randomiza- 
tion, rejection of data, sampling 
natural, 216 
of dosages, 312, 323, 491 
of estimate, 222 
of experimental units, 333 
of interviewers, 464 
of judges, 393 
of levels, 312, 323 
of method of statistical analysis, 67, 
387 
of order of treatments, 509 
of variates, 68 
Sensitivity data, see quantal response 
Sensory tests, see organolepsis 
Sequential experiments, 229, 283 
Smoothing, 161, and see moving averages 
Soil heterogeneity, 228, 362 
Split plots, see strip plots 
covariance adjustments, 23 
tests of significance, 23 
Standard deviation, see variance 
Standard error, see variance 
Standard treatment, 491 
Statistical control, see analysis of co- 
variance 
Statistics texts and periodicals, 449 
Steepest ascent (descent), method of, 21 
Stirling’s approximation, 170 
Stochastic processes, 57 
Stratification, see sampling 
Strip plots, 347, and see split plots 
Structural analysis, 445 
Structure, 31, 226, 230 
Subjective evaluation, see judging, or- 
ganolepsis 
Successive approximation, see iteration 
Sufficient statistics, 213, and see effi- 
ciency, estimation 
Summary tables, 355 
Surveys, 331, and see sample surveys 
Survival curve, see dose response curve, 
time response curve 
Survival time, see time response curve 
Systematic designs, 228 
Tables, miscellaneous, 54, 279, 449 
graphical, 168, 172, 220
Taste tests, see organolepsis 
Teaching of statistics, 93 
methods, 95 
Tests, 449, and see analysis of variance, 
chi square, confidence limits, F test, 
least significant difference, null hy- 
pothesis, ranks, rejection of data 
a posteriori, 230 
approximate, 224 
biased, 224 
correlated, 29, 69 
exact, 224, 264 
likelihood ratio, 264 
logic of, 31 
multiple comparisons, 230 
multiple range, 307, and see multiple 
F test 
Duncan’s, 307 
normal deviate, 272 
of significance, 354 
of significance of 
difference between adjusted 
means, 23 
difference between means, 230 
difference between proportions, 
283 
results suggested by data, 358 
variance, 264 
“zero” class, 268, 282 
studentized, 230 
using range, 230 
Theory, see biometry, hypothesis, model 
Threshold, 491 
Time response curve, 73, and see dose
response curve
Time series, 334 
Tolerance, 72 
Toxicology, 227, 233, 515, and see bio- 
assay, pharmacology 
Transformations, 85, 386, and see addi- 
tivity, analysis of variance, bioas- 
say, canonical analysis, logistic 
curve, matrices, models 
angular, 513 
choice of, 84, 513 
effect of, 123 
logit, 74, 513 
normal deviate, 513 
probit, 74, 513 
squared hyperbolic secant, 399 
square root, 451
Tree crops, 334, 345, and see long-term
experiments
Trend, 334, 362, and see regression
Trigamma function, 170
Twins, 85
Uniformity data, 481
Variance, see covariance, F test, genetics,
mean deviation
analysis of, 9, 41, 68, 84, 100, 142, 250,
345, 369, 428, 447, 450, 481, 507,
and see additivity, analysis of co-
variance, chi square, combination of
data, components of variance, de-
grees of freedom, disproportionate
subclasses, errors, fitting constants,
F tests, least significant difference,
least squares, long-term experi-
ments, missing values, models, mul-
tiple F tests, multivariate analysis,
orthogonal polynomials, path coeffi-
cients, regression, structural analy-
sis, tests, transformations, uniform-
ity data
computation of, 110, 345, 390, 498
of percentages, 85
423, 505, and see components of
covariance, structural analysis
computation of, 339
confidence limits for, 450
interpretation of, 505
variance of, 433
conditional, 273
error, 68, 445, 502
exact, 359
expectation, 99
homogeneity of, 83, 265, 279, 391, 398
negative, 445
of adjusted mean, 251, 366, 372
of difference of adjusted means, aver-
age, 34
of estimate, 218
of frequency, 273
of mean, 143
ratio, confidence limits for, 99
related to mean, 82, 85
sampling, 169, 180
Variate, fixed, 31
Variate, random, 31
Vitamins, 523
ascorbic acid, 499
By, 501
Weighting, 25, 123, 325, 371, 434, 513
Working angle, 514 
maximal, 516 
Working probit, 513 
Youden squares, 451 
Zero, see differences of zero, tests of 
significance of “zero” class 
A gl see F test 


